Abstract. The satisfiability (SAT) problem is a core problem in mathematical
logic and computing theory. In practice, SAT is fundamental in solving
many problems in automated reasoning, computer-aided design, computer-aided
manufacturing, machine vision, database, robotics, integrated circuit
design, computer architecture design, and computer network design.
Traditional methods treat SAT as a discrete, constrained decision problem. In
recent years, many optimization methods, parallel algorithms, and practical
techniques have been developed for solving SAT. In this survey, we present
a general framework (an algorithm space) that integrates existing SAT
algorithms into a unified perspective. We describe sequential and parallel SAT
algorithms including variable splitting, resolution, local search, global
optimization, mathematical programming, and practical SAT algorithms. We give
a performance evaluation of some existing SAT algorithms. Finally, we provide
a set of practical applications of the satisfiability problem.


1.  Introduction 

An  instance  of  the  satisfiability  (SAT)  problem  is  a  Boolean  formula  that  has 
three  components  [101,  188]: 

• A set of $n$ variables: $x_1, x_2, \ldots, x_n$.

• A set of literals. A literal is a variable ($Q = x$) or a negation of a variable
($Q = \bar{x}$).

• A set of $m$ distinct clauses: $C_1, C_2, \ldots, C_m$. Each clause consists of only
literals combined by just logical or ($\vee$) connectives.

The goal of the satisfiability problem is to determine whether there exists an
assignment of truth values to variables that satisfies the following Conjunctive
Normal Form (CNF) formula:

(1.1)  $C_1 \wedge C_2 \wedge \cdots \wedge C_m$,

where $\wedge$ is a logical and connective.
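
The definition above can be made concrete with a minimal Python sketch. The signed-integer clause encoding and the function names `clause_satisfied`, `formula_satisfied`, and `brute_force_sat` are illustrative assumptions, not notation from this survey; the exhaustive search is only meant to illustrate the decision problem and takes time exponential in $n$.

```python
from itertools import product

# Assumed encoding (not the survey's notation): a clause is a list of nonzero
# integers, where +j stands for the literal x_j and -j for its negation.

def clause_satisfied(clause, assignment):
    """A clause (logical or of literals) holds if any literal is true."""
    return any(assignment[abs(lit)] == (lit > 0) for lit in clause)

def formula_satisfied(clauses, assignment):
    """The CNF formula (1.1) holds if every clause holds."""
    return all(clause_satisfied(c, assignment) for c in clauses)

def brute_force_sat(clauses, n):
    """Try all 2^n truth assignments; return a satisfying one or None."""
    for values in product([False, True], repeat=n):
        assignment = {j + 1: v for j, v in enumerate(values)}
        if formula_satisfied(clauses, assignment):
            return assignment
    return None
```

For example, `brute_force_sat([[1, 2], [-1, -2]], 2)` searches for an assignment of the two-variable formula with clauses $(x_1 \vee x_2)$ and $(\bar{x}_1 \vee \bar{x}_2)$.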

The  SAT  problem  is  a  core  of  a  large  family  of  computationally  intractable  NP- 
complete  problems  [101,  188].  Such  NP-complete  problems  have  been  identified 
as  central  to  a  number  of  areas  in  computing  theory  and  engineering.  Since  SAT 
is  NP-complete,  it  is  unlikely  that  any  SAT  algorithm  has  a  fast  worst-case  time 
behavior.  However,  clever  algorithms  can  rapidly  solve  many  SAT  formulas  of 
practical  interest.  There  has  been  great  interest  in  designing  efficient  algorithms  to 
solve  most  SAT  formulas. 

In practice, SAT is fundamental in solving many problems in automated
reasoning, computer-aided design, computer-aided manufacturing, machine vision,
database, robotics, integrated circuit design automation, computer architecture
design, and computer network design (see Section 14). Therefore, methods to solve
SAT formulas play an important role in the development of efficient computing
systems.

Traditional methods treat a SAT formula as a discrete, constrained decision
problem. In recent years, many optimization methods, parallel algorithms, and
practical techniques have been developed. In this survey, we present a general
framework (an algorithm space) that integrates existing SAT algorithms into a
unified perspective. We describe sequential and parallel SAT algorithms and compare
the performance of major SAT algorithms including variable setting, resolution,
local search, global optimization, mathematical programming, and practical SAT
algorithms. At the end of this survey, we give a collection of practical applications
of the satisfiability problem.

The  rest  of  the  paper  is  organized  as  follows. 

1.  Introduction 

2.  Constraint  Satisfaction  Problems 

3.  Preliminaries 

4.  An  Algorithm-Space  Perspective  of  SAT  Algorithms 

5.  SAT  Input  Models 

6.  Splitting  and  Resolution 

7.  Local  Search 

8.  Global  Optimization 

9.  Integer  Programming  Method 

10.  Special  Subclasses  of  SAT 

11.  Advanced  Techniques 

12.  Probabilistic  and  Average-Case  Analysis 

13.  Performance  and  Experiments 

14.  Applications 

15.  Future  Work 

16.  Conclusions 

In the next section, we describe the constraint satisfaction problem (CSP) and its
close relationship to the SAT problem. Section 3 gives preliminaries for the paper.
In Section 4, we give a general framework (an algorithm space) that puts existing
SAT algorithms into a unified perspective. This is followed by a brief overview of
the basic SAT algorithm classes and a discussion of the general performance
evaluation approaches for SAT algorithms. In Section 5, some SAT problem-instance
models are given. Section 6 describes the variable setting and resolution procedures
for solving SAT formulas. Local search algorithms, global optimization techniques,
and integer programming approaches for solving SAT formulas are discussed,
respectively, in Sections 7, 8, and 9. Section 10 discusses special subclasses of the
SAT problem. Some advanced techniques for solving SAT formulas are described
in Section 11. Section 12 gives probabilistic and average-case analysis of the SAT
problem.

Experimental results and performance comparisons of several major SAT
algorithms are given in Section 13. Presently, for hard random 3-SAT problem
instances, a complete SAT algorithm can solve instances with a few hundred
variables. An incomplete SAT algorithm such as WSAT can solve SAT problem
instances with 2,000 variables on an SGI Challenge with a 70 MHz MIPS R4400
processor [472, 471]. A randomized local search algorithm, e.g., SAT1.5, can
solve various SAT problem instances with over 10,000 variables on a SUN SPARC
20 workstation comfortably [211, 212, 222]. Most practical SAT solvers used in
industrial applications are problem specific. We collected some real experimental
results in Section 13. Section 14 summarizes some applications of the SAT
problem. Future work for SAT research is discussed in Section 15. Finally, Section 16
concludes the paper.


2. Constraint Satisfaction Problems

A constraint satisfaction problem (CSP) is to determine whether a set of
constraints over discrete variables can be satisfied. Each constraint must have a form
that is easy to evaluate, so any difficulty in solving such a problem comes from
the interaction between the constraints and the need to find a setting for the
variables that simultaneously satisfies all the constraints [430]. In a SAT formula, each
constraint is expressed as a clause, making SAT a special case of the constraint
satisfaction problem (see Figure 1). Due to this close relationship, any CSP algorithm
can be transformed into a SAT algorithm, and this can usually be done in a way
that maintains the efficiency of the algorithm.

A  discrete  CSP  model  consists  of  the  following  three  components  [206,  226]: 

• $n$ variables: $\{x_1, x_2, \ldots, x_n\}$. An assignment is a tuple of $n$ values assigned
to the $n$ variables.

• $n$ domains: $\{D_1, D_2, \ldots, D_n\}$. Domain $D_i$ contains $d$ possible values (also
called labels) to which $x_i$ may be instantiated, i.e., $D_i = \{l_{i,1}, l_{i,2}, \ldots, l_{i,d}\}$.

• A subset of $D_1 \times D_2 \times \cdots \times D_n$ is a set of constraints. A set of order-$l$
constraints ($l \le n$) imposed on a subset of variables $\{x_{i_1}, x_{i_2}, \ldots, x_{i_l}\} \subseteq
\{x_1, x_2, \ldots, x_n\}$ is denoted as

$C_{i_1, i_2, \ldots, i_l} \subseteq D_{i_1} \times D_{i_2} \times \cdots \times D_{i_l}$.

An order-$l$ constraint indicates the compatibility (i.e., consistency/inconsistency
or conflicting measure) among $l$ variables for a given variable assignment. The
variables conflict if their values do not satisfy the constraint. In practice, two
frequently used constraints are unary constraints imposed on a single variable
($C_i \subseteq D_i$) and binary constraints imposed on a pair of variables ($C_{ij} \subseteq D_i \times D_j$).

Solving a CSP entails minimizing local inconsistency and finding a consistent
value assignment (i.e., a consistent labeling) to the variables subject to the given
constraints.
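
As a small illustration of the binary CSP model, the following Python sketch stores each binary constraint $C_{ij}$ as a set of allowed value pairs and counts the violated constraints of a labeling. The dictionary layout, the function names `count_conflicts` and `is_consistent_labeling`, and the triangle 3-coloring instance are assumptions chosen for the example, not part of the survey.

```python
def count_conflicts(assignment, binary_constraints):
    """Number of violated binary constraints C_ij (subsets of D_i x D_j)."""
    return sum(
        (assignment[i], assignment[j]) not in allowed
        for (i, j), allowed in binary_constraints.items()
    )

def is_consistent_labeling(assignment, domains, binary_constraints):
    """A labeling is consistent if every value lies in its domain and
    no binary constraint is violated."""
    in_domain = all(assignment[i] in domains[i] for i in domains)
    return in_domain and count_conflicts(assignment, binary_constraints) == 0

# Hypothetical instance: 3-coloring of a triangle (three mutually adjacent nodes).
colors = {"r", "g", "b"}
domains = {i: colors for i in range(3)}
different = {(a, b) for a in colors for b in colors if a != b}
constraints = {(0, 1): different, (1, 2): different, (0, 2): different}
```

The conflict count is the kind of local inconsistency measure that a search procedure would try to drive to zero.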

Constraint satisfaction problems are extremely common. Most NP-complete
problems are initially stated as constraint satisfaction problems. Indeed, the proof
that a problem is NP-complete implies an efficient way to transform the problem
into a constraint satisfaction problem. For a few special forms of the constraint
satisfaction problem there exist algorithms that solve such formulas in polynomial
worst-case time. When no polynomial-time algorithm is known for a particular form
of constraint satisfaction problem, it is common practice to solve such formulas with
a search algorithm.

[Figure 1 diagram: problem classes CSP, Discrete CSP, and Binary CSP, with the
examples N-queen problem, Graph coloring problem, SAT problem, and Max-SAT problem.]

Figure 1. Some examples of the constraint satisfaction problem
(CSP). The SAT problem is a special case of CSP, i.e., a CSP with binary
values.

Problems that are commonly formulated as constraint satisfaction or
satisfiability problems for the purposes of benchmarking include graph coloring and the
n-queens problems. In the case of the n-queens problem, although analytical
solutions for this problem exist [2, 10, 30], they provide a restricted subset of solutions.
In practical applications, one must use a search algorithm to find a general solution
to the CSP or SAT problems.


3.  Preliminaries 

To  simplify  our  discussion,  throughout  this  paper,  let: 

• $\mathcal{F}$ be a CNF Boolean formula,

• $m$ be the number of clauses in $\mathcal{F}$,

• $n$ be the number of variables in $\mathcal{F}$,

• $C_i$ be the $i$th clause,

• $|C_i|$ be the number of literals in clause $C_i$,

• $Q_{i,j}$ be the $j$th literal in the $i$th clause, and

• $l$ be the average number of literals: $l = \frac{1}{m} \sum_{i=1}^{m} |C_i|$,

where $i = 1, \ldots, m$ and $j = 1, \ldots, n$.

On Boolean space $\{0,1\}^n$, let:

• $F(\mathbf{x})$ be a function from $\{0,1\}^n$ to integer $N$,

• $x_j$ be the $j$th variable,

• $\mathbf{x}$ be a vector of $n$ variables,

• $C_i(\mathbf{x})$ be the $i$th clause function, and

• $Q_{i,j}(\mathbf{x})$ be the $j$th literal function of the $i$th clause function,

where $i = 1, \ldots, m$ and $j = 1, \ldots, n$.

On real space $E^n$, let:

• $N(\mathbf{x})$ be a real function from $\{0,1\}^n$ to $E$,

• $f(\mathbf{y})$ be a real function from $E^n$ to $E$,

• $y_j$ be the $j$th variable,

• $\mathbf{y}$ be a vector of $n$ variables,

• $C_i(\mathbf{y})$ be the $i$th clause function, and

• $Q_{i,j}(\mathbf{y})$ be the $j$th literal function of the $i$th clause function,

where $i = 1, \ldots, m$ and $j = 1, \ldots, n$.

On real space $E^n$, also let:

• $w_j$ be the $j$th integer variable, and

• $\mathbf{w}$ be a vector of $n$ integer variables,

where $i = 1, \ldots, m$ and $j = 1, \ldots, n$.

Following [356], a real-valued function $f$ defined on a subset of $E^n$ is said to
be continuous at $\mathbf{y}$ if $\mathbf{y}_k \to \mathbf{y}$ implies $f(\mathbf{y}_k) \to f(\mathbf{y})$. A set of real-valued
functions $f_1, f_2, \ldots, f_m$ on $E^n$ form a vector function $\mathbf{f} = (f_1, f_2, \ldots, f_m)$ whose
$i$th component is $f_i$. It is continuous if each of its component functions is continuous.

If $f$ has second partial derivatives which are continuous on this set, we define
the Hessian of $f$ at $\mathbf{y}$ to be the $n \times n$ matrix denoted by

(3.1)  $H(\mathbf{y}) = \nabla^2 f(\mathbf{y}) = \left[ \frac{\partial^2 f(\mathbf{y})}{\partial y_i \, \partial y_j} \right]$.

We call $\mathbf{y} \in E^n$ with $f(\mathbf{y}) = 0$ a solution of $f$, denoted as $\mathbf{y}^*$.

Two aspects of iterative optimization algorithms are their global convergence
and local convergence rates [356]. Global convergence concerns whether, starting
from an initial point, the sequence of points will converge to the final solution
point. Local convergence rate is the rate at which the generated sequence of points
converges to the solution.


4.  An  Algorithm-Space  Perspective  of  SAT  Algorithms 

In this section, we first describe various formulations of SAT, then give an
algorithm-space perspective that provides some insights into developing efficient
algorithms for solving SAT. Following this, we give a brief overview of the basic
sequential and parallel SAT algorithms, and discuss various categories of algorithms
and performance evaluation methods.

4.1. Formulations of SAT. The SAT problem can be expressed by Conjunctive
Normal Form (CNF) formulas (e.g., $(x_1 \vee x_2) \wedge (\bar{x}_1 \vee \bar{x}_2)$) or Disjunctive Normal Form
(DNF) formulas (e.g., $(x_1 \wedge x_2) \vee (\bar{x}_1 \wedge \bar{x}_2)$). Instances of SAT can be formulated
based on discrete or continuous variables [535, 537].


Discrete Formulations. These can be classified as unconstrained versus
constrained.

(a) Discrete Constrained Feasibility Formulations. The goal is to satisfy all
constraints. One possible formulation is the CNF formulas given by (1.1). A second
formulation is the DNF formulas [207] discussed in Section 7.10.

(b) Discrete Unconstrained Formulations. A common formulation for CNF
formulas exists [211, 212, 469]. The goal is to minimize $N(\mathbf{x})$, the number of
unsatisfied clauses, under the interpretation that numeric variable $x_j = 1$ ($x_j = 0$)
if Boolean variable $x_j$ = true ($x_j$ = false), respectively. That is,

(4.1)  $\min_{\mathbf{x} \in \{0,1\}^n} N(\mathbf{x}) = \sum_{i=1}^{m} C_i(\mathbf{x})$,

where

(4.2)  $C_i(\mathbf{x}) = \prod_{j=1}^{n} Q_{i,j}(x_j)$,

(4.3)  $Q_{i,j}(x_j) = \begin{cases} 1 - x_j & \text{if } x_j \text{ in } C_i \\ x_j & \text{if } \bar{x}_j \text{ in } C_i \\ 1 & \text{otherwise.} \end{cases}$

In this case, $N(\mathbf{x}) = 0$ when all the clauses are satisfied.
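
The objective in (4.1)-(4.3) can be evaluated directly. The following Python sketch is a minimal illustration under an assumed encoding (a clause is a list of signed integers, `+j` for $x_j$ and `-j` for $\bar{x}_j$, and `x` maps each index to 0 or 1); the function names are hypothetical.

```python
def literal_value(lit, x):
    """Q_{i,j}(x_j) of (4.3): 1 - x_j for literal x_j, x_j for its negation."""
    j = abs(lit)
    return 1 - x[j] if lit > 0 else x[j]

def clause_value(clause, x):
    """C_i(x) of (4.2). Variables absent from the clause contribute a
    factor of 1, so only the clause's own literals are multiplied."""
    value = 1
    for lit in clause:
        value *= literal_value(lit, x)
    return value

def num_unsatisfied(clauses, x):
    """N(x) of (4.1): each unsatisfied clause contributes 1, others 0."""
    return sum(clause_value(c, x) for c in clauses)
```

A clause product is 0 as soon as one of its literals is true, so `num_unsatisfied` counts exactly the clauses in which every literal is false.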

A similar formulation for DNF formulas exists (see Section 7.10, [234], [207], [217]).
Under the interpretation that numeric variable $x_j = 1$ ($x_j = 0$) if Boolean variable
$x_j$ = true ($x_j$ = false), respectively, the goal is to solve

(4.4)  $\min_{\mathbf{x} \in \{0,1\}^n} F(\mathbf{x}) = \sum_{i=1}^{m} C_i(\mathbf{x})$,

where

(4.5)  $C_i(\mathbf{x}) = 1 - \prod_{j=1}^{n} Q_{i,j}(x_j)$,

(4.6)  $Q_{i,j}(x_j) = \begin{cases} x_j & \text{if } x_j \text{ in } C_i \\ 1 - x_j & \text{if } \bar{x}_j \text{ in } C_i \\ 1 & \text{otherwise.} \end{cases}$

All the clauses are satisfied when $F(\mathbf{x}) = 0$.

Alternatively, DNF formulas can be solved as follows:

(4.7)  $\max_{\mathbf{x} \in \{0,1\}^n} F(\mathbf{x}) = \sum_{i=1}^{m} C_i(\mathbf{x})$,

where

(4.8)  $C_i(\mathbf{x}) = \prod_{j=1}^{n} Q_{i,j}(x_j)$,

and $Q_{i,j}(x_j)$ is given by (4.6).

Usually, the question of falsifiability for a DNF formula is more interesting than
the question of satisfiability. This can be solved as follows:

(4.9)  $\min_{\mathbf{x} \in \{0,1\}^n} F(\mathbf{x}) = \sum_{i=1}^{m} C_i(\mathbf{x})$,

where $C_i(\mathbf{x})$ is given by (4.8). A formula is falsifiable if $F(\mathbf{x}) = 0$ for some $\mathbf{x}$.
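
A minimal Python sketch of the falsifiability test (4.9) follows. It assumes the signed-integer encoding of literals (`+j` for $x_j$, `-j` for $\bar{x}_j$) and reads the literal function (4.6) as $x_j$ for a positive literal and $1 - x_j$ for a negated one; the function names and the exhaustive enumeration are illustrative assumptions only.

```python
from itertools import product

def dnf_literal(lit, x):
    """Literal function in the sense of (4.6): 1 exactly when the literal is true."""
    j = abs(lit)
    return x[j] if lit > 0 else 1 - x[j]

def dnf_clause(term, x):
    """C_i(x) of (4.8): 1 if every literal of the conjunctive clause is true."""
    value = 1
    for lit in term:
        value *= dnf_literal(lit, x)
    return value

def dnf_objective(terms, x):
    """F(x) of (4.7) and (4.9): the number of true conjunctive clauses."""
    return sum(dnf_clause(t, x) for t in terms)

def find_falsifying(terms, n):
    """Return some x with F(x) = 0, or None if the DNF formula is a tautology."""
    for values in product([0, 1], repeat=n):
        x = {j + 1: v for j, v in enumerate(values)}
        if dnf_objective(terms, x) == 0:
            return x
    return None
```

With this reading, $F(\mathbf{x}) > 0$ certifies that $\mathbf{x}$ satisfies the DNF formula, and $F(\mathbf{x}) = 0$ certifies that $\mathbf{x}$ falsifies it.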

(c) Discrete Constrained Formulations. There are various forms of this
formulation. One approach is to formulate SAT formulas as instances of the 0-1 integer
linear programming (ILP) problem.

Another approach is to minimize the objective function $N(\mathbf{x})$, the number of
unsatisfied clauses, subject to a set of constraints, as follows [535, 537]:

(4.10)  $\min_{\mathbf{x} \in \{0,1\}^n} N(\mathbf{x}) = \sum_{i=1}^{m} C_i(\mathbf{x})$

        subject to $C_i(\mathbf{x}) = 0 \quad \forall i \in \{1, 2, \ldots, m\}$.

A formulation based on DNF can be defined similarly.

This formulation uses additional constraints to guide the search. The violated
constraints provide another mechanism to bring the search out of a local minimum.
This formulation is used in a Lagrange multiplier-based method to solve the SAT
problem (see Section 8.7 and [535, 537]).
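
The 0-1 ILP encoding is not written out at this point in the text. One common textbook encoding, assumed here for illustration, turns each clause into the linear inequality $\sum_{x_j \in C_i} x_j + \sum_{\bar{x}_j \in C_i} (1 - x_j) \ge 1$. The Python sketch below builds that inequality in the form $\mathbf{a} \cdot \mathbf{x} \ge b$ and checks a 0-1 vector against it; the function names and the signed-integer clause encoding are hypothetical.

```python
def clause_to_inequality(clause, n):
    """Return (coeffs, rhs) such that the clause holds iff coeffs . x >= rhs.
    A literal x_j adds +1 to coefficient j; a negated literal contributes
    (1 - x_j), i.e. -1 to coefficient j and -1 to the right-hand side."""
    coeffs = [0] * n
    rhs = 1
    for lit in clause:
        j = abs(lit) - 1
        if lit > 0:
            coeffs[j] += 1
        else:
            coeffs[j] -= 1
            rhs -= 1
    return coeffs, rhs

def is_ilp_feasible(clauses, x):
    """Check a 0-1 vector x (list of length n) against every clause inequality."""
    n = len(x)
    for clause in clauses:
        coeffs, rhs = clause_to_inequality(clause, n)
        if sum(a * v for a, v in zip(coeffs, x)) < rhs:
            return False
    return True
```

A linear programming relaxation would replace the 0-1 requirement on each $x_j$ by $0 \le x_j \le 1$ while keeping the same inequalities.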

Continuous Formulations. In formulating a discrete instance of SAT in
continuous space, we transform discrete variables in the original formula into
continuous variables in such a way that solutions to the continuous problem are binary
solutions to the original formula. Such a transformation is potentially beneficial
because an objective in continuous space may "smooth out" some infeasible
solutions, leading to a smaller number of local minima explored. In the following, we
show two such formulations.

(a) Continuous Unconstrained Formulations. There are many possible
formulations in this category. A simple formulation, UniSAT (Universal SAT Problem
Models) [207, 211, 210, 209], suggests:

(4.11)  $\min_{\mathbf{y} \in E^n} f(\mathbf{y}) = \sum_{i=1}^{m} C_i(\mathbf{y})$,

where

(4.12)  $C_i(\mathbf{y}) = \prod_{j=1}^{n} Q_{i,j}(y_j)$,

(4.13)  $Q_{i,j}(y_j) = \begin{cases} |y_j - T| & \text{if } x_j \text{ in } C_i \\ |y_j + F| & \text{if } \bar{x}_j \text{ in } C_i \\ 1 & \text{otherwise,} \end{cases}$

where $T$ and $F$ are positive constants.

Two special formulations to (4.13) exist. In the UniSAT5 model [210, 217]:

(4.14)  $Q_{i,j}(y_j) = \begin{cases} |y_j - 1| & \text{if } x_j \text{ is in } C_i \\ |y_j + 1| & \text{if } \bar{x}_j \text{ is in } C_i \\ 1 & \text{otherwise,} \end{cases}$

and in the UniSAT7 model [210, 217]:

(4.15)  $Q_{i,j}(y_j) = \begin{cases} (y_j - 1)^2 & \text{if } x_j \text{ is in } C_i \\ (y_j + 1)^2 & \text{if } \bar{x}_j \text{ is in } C_i \\ 1 & \text{otherwise.} \end{cases}$

Values of $\mathbf{y}$ that make $f(\mathbf{y}) = 0$ are solutions to the original formula in (1.1).
UniSAT5 can be solved with efficient, discrete, greedy local search algorithms
(Section 8 and [217]). UniSAT7 requires computationally expensive continuous
optimization algorithms, rendering it applicable to only small formulas (Section 8
and [217, 227]).
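
The two UniSAT literal functions can be compared numerically with a short Python sketch. It assumes $T = F = 1$, the signed-integer clause encoding (`+j` for $x_j$, `-j` for $\bar{x}_j$), and a dictionary `y` of real values indexed from 1; the function name `unisat_objective` and its `model` parameter are hypothetical.

```python
def unisat_objective(clauses, y, model=5):
    """f(y) of (4.11): sum over clauses of the product of literal functions.
    model=5 uses |y_j - 1| and |y_j + 1|; model=7 uses the squared versions."""
    total = 0.0
    for clause in clauses:
        term = 1.0
        for lit in clause:
            yj = y[abs(lit)]
            d = yj - 1.0 if lit > 0 else yj + 1.0
            term *= abs(d) if model == 5 else d * d
        total += term
    return total
```

Under these assumptions a literal factor vanishes at $y_j = 1$ for $x_j$ and at $y_j = -1$ for $\bar{x}_j$, so a zero of $f$ at a point with all $y_j \in \{-1, 1\}$ can be read as a truth assignment with $1$ for true and $-1$ for false.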

(b) Continuous Constrained Formulations. This generally involves a heuristic
objective function that measures the quality of the solution obtained (such as the
number of clauses satisfied). One formulation similar to (4.11) is as follows.

(4.16)  $\min_{\mathbf{y} \in E^n} f(\mathbf{y}) = \sum_{i=1}^{m} C_i(\mathbf{y})$

        subject to $C_i(\mathbf{y}) = 0 \quad \forall i \in \{1, 2, \ldots, m\}$,

where $C_i(\mathbf{y})$ is defined in (4.12).

ALGORITHMS FOR THE SATISFIABILITY (SAT) PROBLEM: A SURVEY

[Figure 2 diagram not reproduced; surviving axis labels: Unconstrained, Discrete.]

Figure 2. The algorithm space is a unified framework for discrete
search algorithms and continuous optimization algorithms. The
octants represent eight basic classes of algorithms.

The key in this approach lies in the transformation. When it does not smooth
out local minima in the discrete space or when solution density is low, continuous
methods are much more computationally expensive to apply than discrete methods.

Since (4.16) is a continuous constrained optimization problem with a nonlinear
objective function and nonlinear constraints, we can apply existing
Lagrange-multiplier methods to solve it. Our experience is that a Lagrangian transformation
does not reduce the number of local minima, and continuous Lagrangian methods
are at least an order-of-magnitude more expensive to apply than the corresponding
discrete algorithms [80].

4.2.  The  Algorithm  Space.  Discrete  search  algorithms  relate  to  continuous 
optimization  methods  in  operations  research.  Many  discrete  search  problems  can 
be  solved  with  numerical  algorithms  in  the  real  space.  A  unified  framework  for 
search  and  optimization  would  shed  light  on  developing  efficient  algorithms  for  a 
search  problem.  Figure  2  shows  a  typical  algorithm  space  that  unifies  a  variety  of 
search  and  optimization  algorithms  in  terms  of  variable  domain,  constraint  used, 
and  parallelism  in  the  algorithms  [217]. 

Satisfiability is expressed with discrete variables, but some algorithms do their
calculations with continuous variables. This leads to the discrete-continuous axis in
the space. Satisfiability has a set of constraints that must be satisfied exactly, but
some procedures (e.g., local search) consider changes in variable values in clauses
that do not satisfy the constraints (typically, these algorithms assign some cost to
non-satisfying constraints and then look for the least-cost solution). This defines
the vertical axis in Figure 2 showing constraint characteristics in the algorithm
space. Most SAT algorithms are sequential, while some have been implemented
in parallel. A third axis indicating parallelism in the algorithms is added in the
algorithm space. Following the three axes, the algorithm space is divided into eight
octants, representing the four sequential algorithm classes, i.e., discrete constrained
algorithms, discrete unconstrained algorithms, continuous constrained algorithms,
and continuous unconstrained algorithms, and four parallel algorithm classes, i.e.,
parallel discrete constrained algorithms, parallel discrete unconstrained algorithms,
parallel continuous constrained algorithms, and parallel continuous unconstrained
algorithms.

JUN GU, PAUL W. PURDOM, JOHN FRANCO, AND BENJAMIN W. WAH

[Figure 3 diagram: four quadrants on a Discrete/Continuous axis and an
Unconstrained/Constrained axis. Discrete, unconstrained: local search.
Continuous, unconstrained: unconstrained optimization. Discrete, constrained:
consistency checking. Continuous, constrained: linear programming.]

Figure 3. A 2-dimensional cross section of the algorithm space
cut at the sequential side. It indicates a unified framework for some
discrete search algorithms and continuous optimization techniques
for solving SAT.

Figure 3 gives some typical examples for the four sequential classes of SAT
algorithms in the space. In the discrete search space (left half of Figure 3),
variables, values, constraints, and the objective functions are defined with discrete
values. If one handles a discrete search problem with consistency checking or
constraint resolution, the approach belongs to the class of discrete constrained
methods [226, 358, 381, 539]. Alternatively, one can formulate the constraints into
an objective function and minimize the objective function without looking at any
problem constraints. Algorithms in this category, such as local search procedures,
are usually called discrete, unconstrained methods [211, 212, 400, 484, 488].

In the continuous search space (right half of Figure 3), variables, values,
constraints, and objective functions are defined quantitatively with real values. If one
solves a continuous optimization problem with explicit constraints, one uses
continuous constrained methods, such as constrained minimization, primal methods, and
cutting plane methods [356]. If the problem constraints are incorporated into an
objective function, then the problem is transformed into an unconstrained one. The
latter can be solved by the continuous unconstrained methods, such as the descent
methods, conjugate direction methods, and Newton methods [217, 216, 356].

[Figure 4 diagram not reproduced; surviving axis labels: Unconstrained, Discrete.]

Figure 4. An algorithm space incorporating algorithm completeness
for solving SAT. Each octant represents one class of SAT
algorithms.

From an operations research point of view, most discrete search algorithms
have continuous optimization versions, and most constrained search methods have
unconstrained counterparts. For instance, discrete consistency algorithms are
constrained algorithms. If we formulate the amount of "inconsistency" into an
objective function, a local search method can often be used to solve an input efficiently.
Furthermore, local search works in discrete search space. By extending a search
problem into a real search space, constrained and unconstrained global optimization
algorithms can be developed to solve SAT [35, 216, 217, 257, 284, 283].
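
The step from a constrained view to an unconstrained one can be sketched in Python: the amount of "inconsistency" becomes the number of unsatisfied clauses, and a greedy local search flips one variable at a time to reduce it. This is a generic illustration, not one of the specific SAT1 or GSAT procedures cited in this survey; the function names, the signed-integer clause encoding, the `max_flips` bound, and the random flip used at a local minimum are all assumptions.

```python
import random

def num_unsat(clauses, x):
    """Objective: number of clauses with no true literal under assignment x."""
    return sum(
        not any(x[abs(lit)] == (lit > 0) for lit in clause)
        for clause in clauses
    )

def greedy_local_search(clauses, n, max_flips=1000, seed=0):
    """Flip the single variable that most reduces the objective; at a local
    minimum, flip a random variable. Returns a satisfying assignment or None
    (None does not prove unsatisfiability: the method is incomplete)."""
    rng = random.Random(seed)
    x = {j: rng.random() < 0.5 for j in range(1, n + 1)}
    for _ in range(max_flips):
        cost = num_unsat(clauses, x)
        if cost == 0:
            return x
        best_j, best_cost = None, cost
        for j in range(1, n + 1):
            x[j] = not x[j]              # tentative flip
            trial = num_unsat(clauses, x)
            x[j] = not x[j]              # undo
            if trial < best_cost:
                best_j, best_cost = j, trial
        if best_j is None:               # no improving flip exists
            best_j = rng.randint(1, n)
        x[best_j] = not x[best_j]
    return x if num_unsat(clauses, x) == 0 else None
```

The search never inspects the constraints individually; it only reads the objective value, which is what places it in the unconstrained half of the algorithm space.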

The algorithm space provides a unified and global perspective on the
development of search and optimization algorithms for solving SAT. In general, for a given
instance of a search problem if one can find an algorithm in one octant, then one
could possibly find some closely related algorithms in other octants. In the left two
quadrants in Figure 3, for example, once we had consistency algorithms and local
search algorithms for solving SAT, it would be natural to think about unconstrained
optimization algorithms for solving SAT in the right two quadrants; something
must be put there to meet the natural symmetry. This was the original incentive
to develop unconstrained optimization algorithms for solving SAT [217].

There  are  other  ways  of  looking  at  a  variety  of  SAT  algorithms.  A  different 
algorithm  space  for  SAT  that  incorporates  algorithm  completeness  was  given  in 
[217]  (see  Figure  4). 

4.3. Basic SAT Algorithm Classes. Following the algorithm space, a number
of major SAT algorithm classes can be identified. They are given in Figure 5
in chronological order. Most existing SAT algorithms can be grouped into these
categories.

Discrete, constrained:
    1960: Davis-Putnam (DP) algorithm [118]
    1965: Resolution [446]
    1971: Consistency algorithms [95, 206, 270, 538]
    1978: Loveland's Davis-Putnam (DPL) [117, 354]
    1986: Parallel consistency chips [206, 224, 225]
    1986: Binary decision diagrams (BDD) [17, 59]
    1988: Chip and conquer [192]
    1990: DPL plus heuristic (DPLH) [283]
    1989: Local search & backtracking [212]
    1993: Backtracking and probing [430]
    1994: Parallel DP algorithm [38]
    1994: Matrix inequality system [509]
    1996: CSAT [151]

Discrete, unconstrained:
    1987: Randomized local search (SAT1) [207, 212]
    1987: Parallel local search (SAT1.6) [207, 212]
    1988: Local search for n-queen [206, 481, 482]
    1990: Unison algorithm and hardware [489, 490]
    1991: Local search complexity [220, 400]
    1991: Local search for 2-SAT [399]
    1992: Local search with traps (SAT1.5) [211, 212]
    1992: Greedy local search - GSAT [469]

Continuous, constrained:
    1986: Branch-and-bound (APEX) [35]
    1988: Programming models [35, 284]
    1988: Cutting plane [258, 256]
    1989: Branch-and-cut [259]
    1989: Interior point method [301, 299]

Continuous, unconstrained:
    1987: UniSAT models [207, 210, 217]
    1987: Global optimization (SAT6) [207, 217]
    1989: Neural net models [290, 75]
    1990: Global optimization & backtracking [217]
    1991: SAT14 algorithms [217]

Figure 5. Some typical algorithms for the SAT problem.

• Discrete, constrained algorithms. Algorithms in this category treat a SAT
formula as an instance of a constrained decision problem, applying discrete
search and inference procedures to determine a solution. One straightforward
way to solve an instance of SAT is to enumerate all possible truth
assignments and check to see if one satisfies the formula. Many improved
techniques, such as consistency algorithms [226, 358], backtracking
algorithms [34, 53, 64, 326, 422], term-rewriting [130, 267], production
system [479], multi-valued logic [475], Binary Decision Diagrams [59, 17], chip
and conquer [192], resolution and regular resolution [195, 354, 394, 446,
511, 522, 546], independent set algorithm [280], and matrix inequality
system [509], have been proposed.

Many of the discrete constrained algorithms eliminate one variable at a
time. This can be done either by making repeated use of resolution, as was
done in the original version of the Davis-Putnam (DP) procedure [118], or
by assigning some variable each possible value and generating a sub-formula
for each value, as was done in Loveland's modification to the DP procedure
[117, 354]. Resolution generates only one new formula, but in the worst
case the number of clauses in that new formula will be proportional to the
square of the number of clauses in the original formula. Assigning values to
a variable (often called searching) generates two new formulas. For random
formulas, resolution methods are fast when the number of clauses is small
compared to the number of variables [166, 86], while search methods are fast
except when the number of clauses is such that the expected number of
solutions is near one [430]. The two approaches can be combined, using
resolution on some variables and search on others.

Other specific algorithms using these principles include simplified DP
algorithms [181, 203, 427], and a simplified DP algorithm with strict
ordering of variables [268]. The DP algorithm improved in certain aspects
over Gilmore's proof method [197]. Analyses of SAT algorithms often
concentrate on algorithms that are simple because it is difficult to do a correct
analysis of the best algorithms. Under those conditions where simple
algorithms are fast, related practical algorithms are also fast. (It is difficult to
tell whether a practical algorithm is slow under conditions that make the
corresponding simplified algorithm slow.)

A number of special SAT problems, such as 2-satisfiability and Horn
clauses, are solvable in polynomial time [5, 101, 394]. Several linear-time
algorithms [18, 155] and polynomial-time algorithms [399, 459] exist.

• Discrete, unconstrained algorithms. In this approach, the number of
unsatisfied CNF (or satisfied DNF) clauses is formulated as the value of the
objective function, transforming the SAT formula into a discrete, unconstrained
minimization of the objective function. Local search is a major class
of discrete, unconstrained search methods [211, 212, 226, 220, 400, 469].
It can be used to solve the transformed formula (see Section 7).

• Constrained programming algorithms. Methods in this class were developed
based on the fact that CNF or DNF formulas can be transformed to instances
of Integer Programming, and possibly solved using Linear Programming
relaxations [35, 257, 258, 284, 301, 299, 398, 545]. Many approaches,
including branch-and-bound [35], cutting-plane [258, 256], branch-and-cut
[259], interior-point [301, 299], and improved interior-point [476], have
been proposed to solve the integer program representing the inference
problem. Researchers found integer programming methods faster than resolution
for certain classes of problems, although these methods do not possess a
robust convergence property and often fail to solve hard instances of
satisfiability [35, 257, 258, 284, 301, 299].

Discrete
    Constrained
        1983: Parallel CLP algorithms [514, 369]
        1986: Parallel DRA chips [206, 224, 225]
        1987: Parallel DP algorithm [87]
        1988: Parallel AC algorithms [458]
        1988: Parallel CSP architectures [206, 225]
        1990: Unison algorithm and hardware [489, 490]
        1992: Vectorized DP algorithm [157]
        1994: MIMD DP algorithm [38]
    Unconstrained
        1987: CNF local search [207, 212]
        1987: DNF local search [207, 217]
        1987: Parallel local search [207, 212]
        1991: Discrete αβ relaxation [208]
        1993: Multiprocessor local search [487, 486]
Continuous
    Constrained
        1989: Interior point method [301, 299]
    Unconstrained
        1987: UniSAT models [207, 217]
        1987: Global optimization (SAT6) [207, 217]
        1991: Continuous αβ relaxation [208]
        1991: SAT14 algorithms [217]
        1991: Parallel global optimization [210, 217]
        1992: Neurocomputing [218]

Figure 6. Some parallel SAT/CSP algorithms.



• Unconstrained, global optimization algorithms. Special models have been
formulated to transform a discrete formula on Boolean space $\{0, 1\}^n$ (a
decision problem) into an unconstrained UniSAT problem on real space (an
unconstrained global optimization problem). The transformed formulas can
be solved by many existing global optimization methods [207, 211, 210,
217, 226] (see Section 8).
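
The idea of moving from Boolean space to real space can be sketched as follows. The specific encoding is an assumption of this example, since the survey defers the UniSAT models to Section 8: true is mapped to +1 and false to -1, a literal x_j contributes (y_j - 1)^2, a negated literal contributes (y_j + 1)^2, a clause becomes the product of its literal terms, and the formula becomes the sum over clauses. The function names are hypothetical.

```python
def continuous_objective(clauses, y):
    """Real-valued objective that is zero exactly at points encoding solutions.

    Assumed encoding (illustrative only): y[i] is the real value for variable
    i + 1, with +1 standing for true and -1 for false.
    """
    total = 0.0
    for clause in clauses:
        term = 1.0
        for lit in clause:
            yj = y[abs(lit) - 1]
            # The factor vanishes when the literal is "true" at y.
            term *= (yj - 1.0) ** 2 if lit > 0 else (yj + 1.0) ** 2
        total += term  # OR inside a clause -> product, AND of clauses -> sum
    return total

def round_to_assignment(y):
    """Map a real point back to truth values by its sign."""
    return [value >= 0.0 for value in y]
```

Any continuous minimizer (gradient descent, coordinate descent, and so on) could then be applied to `continuous_objective`; a global minimum of value zero corresponds to a satisfying assignment under this encoding.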

4.4. Parallel SAT Algorithms. In practice, most sequential SAT algorithms
can be mapped onto parallel computer systems, resulting in parallel SAT algorithms
[218]. A speedup greater than the number of processors sometimes occurs because
of correlations among variable settings that lead to solutions [383, 338].
Accordingly, as given in Figure 6, there are four classes of parallel algorithms for
solving SAT.

• Parallel, discrete, constrained algorithms. Many discrete, constrained SAT/CSP
algorithms have been implemented as parallel algorithms or put on
special-purpose VLSI hardware architectures. These include parallel consistent
labeling algorithms [514, 369], parallel discrete relaxation (DRA) chips [224,
206, 225], parallel arc consistency (PAC) algorithms [458], parallel
constrained search architectures [206, 225], parallel Unison algorithms [489],
parallel Unison architectures [490], parallel DP algorithms [38, 87, 157],
and parallel logical programming languages [99, 350, 529, 530, 531].

General-Purpose Programming
    Sequential Machines
        VAX-8600
        SUN workstations
    Parallel Machines
        1983: Multicomputers [514, 369]
        1987: CRAY [457]
        1988: BBN Butterfly Plus [456]
        1988: Connection Machine [103]
        1989: KORBX vector computer [301, 299]
        1992: ETA10Q vector computer [157]
        1994: INMOS Transputer [38]
Special-Purpose Architectures
    Sequential Machines
        1980: Analog processor [371]
        1986: DRA1 architectures [317, 224]
    Parallel Machines
        1986: DRA architectures [206, 224, 225]
        1986: mDRA architectures
        1987: CSP architectures [206]
        1987: mCSP architectures [206]
        1988: DRA model architecture [103]
        1989: DRA model architecture [349]
        1990: Unison architectures [489, 490]

Figure 7. Computer architectures used for running SAT/CSP algorithms.



• Parallel, discrete, unconstrained algorithms. A number of discrete local
optimization algorithms were implemented on parallel computing machines.
These include CNF local search [207, 212], DNF local search [207, 217],
parallel local search [207, 212], and multiprocessor local search [487, 486].
A new αβ relaxation technique was developed in a parallel and distributed
environment [208].

• Parallel, constrained programming algorithms. Kamath et al. implemented
an interior point zero-one integer programming algorithm on a KORBX(R)
parallel/vector computer [301, 299].

• Parallel, unconstrained, global optimization algorithms. Several of these
algorithms have been implemented: UniSAT models [207, 217], parallel,
continuous αβ relaxation [208], and parallel global optimization algorithms
[210, 217].

Computer architectures affect the data structures, implementation details, and
thus the performance of SAT algorithms. A variety of computer systems have
been used for running SAT algorithms (Figure 7). Most early studies of CSP/SAT
algorithms were performed on sequential computers. Recent work has concentrated
on parallel programming on multiprocessors. McCall et al. [369, 514] simulated
an 8-processor architecture with various system, topology, and performance
criteria for the forward checking CSP algorithm. Samal implemented several
parallel AC algorithms on a CRAY computer [457] and an 18-node BBN Butterfly Plus
MIMD, shared-memory, homogeneous parallel processor [456]. Cooper and Swain
implemented parallel AC algorithms on a Connection Machine [103]. Kamath,
Karmarkar, et al. implemented an interior point zero-one integer programming
algorithm for SAT on a KORBX(R) parallel/vector computer [301, 299]. Recently,
Fang and Chen implemented a vectorized DP algorithm on an ETA10Q vector
computer [157]. Speckenmeyer and Bohm have experimented with the parallelization
of variants of the Davis-Putnam-Loveland (DPL) procedure on a message-based
MIMD Transputer system built with 320 (INMOS T800/4MB) processors [38].
In their implementation, for some small $k$, each of $2^k$ processors solves a formula
arising at depth $k$ of a DPL search tree, and computation ceases as soon as one
processor reports that its formula is satisfiable. Speckenmeyer noticed that the
time to completion was usually less than $N/2^k$, where $N$ is the time taken by the
serial version [38].
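
The depth-k decomposition described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the Transputer implementation of [38]: clauses are lists of signed integers, the names `simplify`, `solve`, and `parallel_solve` are hypothetical, and Python threads stand in for processors (they show the structure of the decomposition but give no real speedup).

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product

def simplify(clauses, lit):
    """Set literal `lit` true: drop satisfied clauses, delete falsified literals."""
    return [[l for l in c if l != -lit] for c in clauses if lit not in c]

def solve(clauses):
    """Sequential splitting search; True iff `clauses` is satisfiable."""
    if not clauses:
        return True
    if any(len(c) == 0 for c in clauses):
        return False
    lit = clauses[0][0]
    return solve(simplify(clauses, lit)) or solve(simplify(clauses, -lit))

def parallel_solve(clauses, split_vars):
    """Fix the k variables in `split_vars` in all 2^k ways, one worker each."""
    subformulas = []
    for signs in product([1, -1], repeat=len(split_vars)):
        sub = clauses
        for v, s in zip(split_vars, signs):
            sub = simplify(sub, s * v)
        subformulas.append(sub)
    with ThreadPoolExecutor(max_workers=len(subformulas)) as pool:
        # The formula is satisfiable iff some subformula is. A real system
        # would also cancel the remaining workers once one reports success;
        # this sketch simply waits for them to finish.
        return any(pool.map(solve, subformulas))
```

For example, `parallel_solve(clauses, [1, 2])` creates the four subformulas found at depth 2 of the search tree.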

Research continues on building special-purpose VLSI architectures to
speed up SAT/CSP computations. For an n-variable and m-value instance of CSP,
Wang and Gu [540, 541] gave an $O(n^2 d^2)$ time parallel DRA2 algorithm and
an SIMD DRA2 architecture. Furthermore, Gu and Wang [224] gave an $O(n^2 d)$
time parallel DRA3 algorithm and a dynamic DRA3 architecture for solving
general DRA problems. Later, Gu and Wang [206, 225] developed an $O(nd)$ time,
massively parallel DRA5 algorithm and a parallel DRA5 VLSI architecture. For
problems of practical interest, parallel DRA algorithms running on special-purpose
VLSI architectures offer many orders of magnitude in performance improvement
over sequential algorithms.

Recently,  Sosic,  Gu,  and  Johnson  have  developed  a  number  of  parallel  algo¬ 
rithms  and  architectures  for  differential,  non-clausal  inference  of  SAT  formulas 
[489,  490]. 

An extreme example of parallel processing is to compute using chemistry with
DNA molecules. This would appear to lead to a factor of about 10^^ degrees of
parallelism with a slowdown of perhaps 10^° in the time for computation steps, but
this approach has not been investigated in enough detail to determine its practical
limitations [3, 352]. This SAT evaluation approach is both parallel and random:
if it says you have a solution then you definitely do; if it says you do not then
you probably do not.

4.5. Algorithm Categories. Some SAT algorithms are complete (they
definitely determine whether an input has a solution or does not have one) [59,
118, 117, 354, 446], while others are incomplete (they sometimes determine
whether or not the input has a solution, but in other cases they cannot find one)
[212, 217, 299, 399].

Most  incomplete  algorithms  find  one  solution  (or  perhaps  several  solutions)  in 
favorable  cases,  but  give  up  or  do  not  terminate  in  other  cases.  In  such  cases  one 
does  not  know  whether  the  input  has  no  solution  or  the  algorithm  did  not  search 
hard enough. Some incomplete algorithms can verify that a formula has no solution
but cannot find a solution when at least one exists. Such is the case for incomplete
algorithms  that  check  for  patterns  that  imply  unsatisfiability.  In  the  strict  sense 
of  the  word  algorithm,  incomplete  algorithms  are  not  algorithms  at  all,  but  such 
procedures  are  of  particular  interest  for  inputs  that  are  so  difficult  that  a  complete 
algorithm  cannot  solve  them  in  reasonable  time. 

Complete algorithms can perform one of the following actions: (1) determine
whether or not a solution exists, (2) give the variable settings for one solution,
(3) find all solutions or an optimal solution, (4) prove that there is no solution.
Algorithms of the first type would be of theoretical interest only were it not for the
fact that any such algorithm can be modified, with little loss of efficiency, to give
an algorithm of the second type. Algorithms of the third type are needed when
there is some measure of the solution quality and the optimal solution is sought, or
when the overall problem has constraints in addition to those of the SAT instance.
These algorithms are essential to many important practical applications that are
NP-hard in nature. Recently, Major et al. used SAT as a step preceding a program
that calculates chemical interaction energies to predict RNA folding [359]. Gu and
Puri developed an efficient complete SAT algorithm for asynchronous computer
circuit design, aiming at producing the minimal circuit structure [223, 435].
Incomplete algorithms cannot optimize solution quality and so play little role in
solving practical optimization problems.
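
The remark that a type (1) algorithm can be turned into a type (2) algorithm with little loss of efficiency can be made concrete. The sketch below is one standard way to do it (fixing variables one at a time), not a construction taken from the survey. The brute-force `decide` is only a hypothetical stand-in for any yes/no SAT procedure, and clauses are assumed to be lists of signed integers.

```python
from itertools import product

def decide(clauses, n_vars):
    """Stand-in type (1) procedure: brute-force yes/no satisfiability test."""
    for bits in product([False, True], repeat=n_vars):
        if all(any(bits[abs(l) - 1] == (l > 0) for l in c) for c in clauses):
            return True
    return False

def find_assignment(clauses, n_vars, decide=decide):
    """Type (2) procedure built from n_vars + 1 calls to a type (1) procedure."""
    if not decide(clauses, n_vars):
        return None
    fixed = []  # unit clauses recording the values chosen so far
    for v in range(1, n_vars + 1):
        # Try v = True by adding the unit clause [v]; otherwise v must be False.
        if decide(clauses + fixed + [[v]], n_vars):
            fixed.append([v])
        else:
            fixed.append([-v])
    return {abs(u[0]): u[0] > 0 for u in fixed}
```

The decision procedure is called once per variable plus once at the start, so the overhead is a factor linear in the number of variables.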

Requiring  a  program  to  produce  each  solution  in  explicit  form  ensures  that 
the  worst-case  time  will  be  exponential  whether  or  not  P  =  NP  (because  some 
inputs  have  an  exponential  number  of  solutions).  An  alternative  is  to  give  the 
solutions  in  some  compressed  form.  For  example,  some  algorithms  implicitly  list 
all  solutions  by  giving  cylinders  of  solutions,  i.e.,  the  settings  of  some  variables 
with  the  understanding  that  the  remaining  variables  are  don’t  cares  which  can  have 
any  value.  For  some  formulas,  using  this  approach  to  represent  all  solutions  is 
much  more  compact  than  an  explicit  representation  [64,  373].  Binary  Decision 
Diagrams  (BDD)  are  a  more  sophisticated  and  compact  way  to  represent  the  set  of 
all  solutions  [59,  17].  Some  instances  of  SAT,  however,  have  a  structure  such  that 
it  is  faster  to  generate  the  solution  to  various  subsets  of  the  constraints  (depending 
on  a  subset  of  the  variables)  and  then  test  whether  those  various  solution  sets  have 
anything  in  common  rather  than  try  to  solve  the  entire  formula  at  once.  This 
type  of  SAT  algorithm  shows  greater  efficiency  improvements  for  certain  practical 
engineering  design  problems  [436]. 
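
The notion of cylinders of solutions can be illustrated with a short sketch, given here under assumptions rather than as the method of [64, 373]: clauses are lists of signed integers, and the function names are hypothetical. Each yielded tuple fixes some variables; every variable not mentioned is a don't care.

```python
def simplify(clauses, lit):
    """Set literal `lit` true: drop satisfied clauses, delete falsified literals."""
    return [[l for l in c if l != -lit] for c in clauses if lit not in c]

def cylinders(clauses, partial=()):
    """Yield disjoint partial assignments that together cover all solutions."""
    if any(len(c) == 0 for c in clauses):
        return                      # an empty clause: no solutions down here
    if not clauses:
        yield partial               # remaining variables are don't cares
        return
    v = abs(clauses[0][0])
    for lit in (v, -v):
        yield from cylinders(simplify(clauses, lit), partial + (lit,))

def count_solutions(clauses, n_vars):
    """Each cylinder with j fixed variables stands for 2^(n_vars - j) solutions."""
    return sum(2 ** (n_vars - len(c)) for c in cylinders(clauses))
```

For the single clause (x1 OR x2) over three variables, two cylinders represent all six solutions.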

The  techniques  used  in  complete  SAT  algorithms  can  usually  be  adapted  to 
provide  exact  solutions  to  optimization  problems.  The  techniques  used  in  incom¬ 
plete  SAT  algorithms  can  usually  be  adapted  to  provide  approximate  solutions  to 
optimizations  problems.  They  normally  lead  to  algorithms  that  produce  low  (but 
not  necessarily  the  lowest)  value  of  the  function. 

For random sets of formulas, the probability that a particular formula has at
least one solution is perhaps the most important parameter for determining how
difficult the set will be for a particular algorithm. The best known algorithms have
difficulty when the probability is near 0.5, but are fast when the probability is close
to 0 or 1. We use formulas generated from the 3-SAT model as an example. Figure
8 shows, for 50 variables, the real execution results of the DP algorithm for 100 to
500 clauses. The computing time used by a program for the DP algorithm ([118,
117, 354]) is shown for the 3-SAT model (solid line) and the average 3-SAT model
(dotted line) [207, 212, 380, 109]. Random formulas generated in the left region
are usually satisfiable, and the procedure is fast. Random formulas in the right
region are usually unsatisfiable, and the procedure is fast. For random formulas in
the middle, many are satisfiable and many are unsatisfiable; the procedure is slow.
Because the DP algorithm is a complete algorithm, it is able to verify satisfiability
and unsatisfiability. So it gives results for random formulas in all three regions.

The results of the DP algorithm may not hold for a different SAT algorithm.
A local search may often find a solution for a satisfiable CNF much more quickly
than the DP algorithm but does not always verify satisfiability and cannot prove
unsatisfiability. In particular, it gives no answer if a CNF formula is not satisfiable.
Thus, for most formulas in the peak region and nearly all formulas in the right
region, a local search algorithm will not terminate in a reasonable amount of time.

Figure 8. Computing time for the exact and the average 3-SAT
models (with 50 variables) on a SUN SPARC 1 workstation. The
horizontal axis (number of clauses) is measured by m or m/n.

4.6.  Performance  Evaluation.  The  performance  of  an  algorithm  can  be  de¬ 
termined  experimentally  or  analytically.  It  is  feasible  to  do  experimental  studies 
with  typical  or  random  formulas,  but  not  with  worst-case  formulas  (there  are  too 
many  formulas  of  a  given  size  to  experimentally  determine  which  one  leads  to  the 
worst-case  time).  It  is  feasible  to  do  analytical  studies  with  random  or  worst-case 
formulas  but  not  with  typical  formulas  (typical  sets  of  formulas  seldom  have  a 
mathematical  structure  suitable  for  analysis). 

Experimental  studies  are  sometimes  inconclusive  because  they  consider  a  rela¬ 
tively  small  number  of  input  possibilities.  Such  restrictions  are  often  forced  because 
the  space  of  likely  input  formulas,  and  even  the  size  of  such  formulas,  is  so  large. 
Analytical  studies  are  intended  to  determine  performance  over  broad  families  of  in¬ 
puts  where  each  family  typically  represents  a  class  of  formulas  of  a  particular  size. 
However,  such  studies  have  the  drawback  that  only  the  simplest  of  algorithms  can 
be  analyzed.  To  compensate  for  this,  several  features  of  a  complex  algorithm  can 
be  removed,  leaving  a  rather  simple,  more  analyzable  one.  The  simplified  algorithm 
usually  contains  one  or  two  simple  techniques,  such  as  the  unit-clause-rule,  or  the 
pure-literal-rule.  An  analytical  result  on  the  simplified  algorithm  provides  a  bound 
on  the  performance  of  the  complex  algorithm,  and  this  bound  is  sometimes  suffi¬ 
cient  to  understand  the  behavior  of  the  complex  algorithm.  Such  an  approach  has 
the  following  side  benefit:  analytical  studies  can  suggest  which  simple  techniques 
should be included in practical algorithms. In fact, most of the 6 prize winners of
the 1991 SAT contest were associated with analytical studies of SAT algorithms
[69, 70]. The two top winners were associated with both experimental and
analytical studies of SAT algorithms. The analytical studies of SAT algorithms involve
the following.

1.  Worst-Case  Studies. 

Unless  P  =  NP,  all  SAT  algorithms  have  a  worst-case  time  that  is  superpoly¬ 
nomial  in  the  input  size  [101].  A  number  of  studies  have  concentrated  on  the 
worst-case  analysis  of  variable  setting  algorithms  for  solving  SAT  [192,  382,  318]. 

2.  Probabilistic  Studies. 

Since  the  typical  performance  of  many  satisfiability  algorithms  is  much  better 
than  any  proven  worst-case  results,  there  is  considerable  interest  in  evaluating  the 
probabilistic  performance  of  these  algorithms.  Such  studies  use  some  model  for 
generating  random  formulas  and  then  calculate  the  performance  of  algorithms  on 
these  formulas.  The  two  most  widely  used  measures  of  performance  are  average 
time  and  probabilistic  time. 

Average  time  is  a  weighted  average  of  the  time  (or  some  related  measure,  such 
as  the  number  of  nodes)  to  solve  a  given  sample  of  formulas.  An  algorithm  must 
solve  each  formula  for  the  average  to  be  defined.  In  probabilistic  time  studies,  an 
algorithm  is  given  a  deadline  (usually  specified  as  a  polynomial  in  the  length  of 
input  formulas),  and  one  studies  the  fraction  of  formulas  that  are  solved  within 
the  deadline.  Probabilistic  time  studies  can  be  performed  on  algorithms  which  give 
up  on  some  fraction  of  the  formulas  so  long  as  that  fraction  is  less  than  the  goal 
fraction. 

For  incomplete  algorithms,  the  average  time  is  not  defined  so  only  the  fraction 
of  inputs  solved  can  be  studied.  One  can  also  use  various  hybrid  measures,  such  as 
the  average  time  used  to  solve  the  easiest  90  percent  of  the  inputs. 
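
The three measures just described can be written down directly. The sketch below is only illustrative: the function names are hypothetical, and a solve time of `None` is an assumed convention for a formula on which the algorithm gave up.

```python
def average_time(times):
    """Mean solve time; defined only if every formula was solved."""
    if any(t is None for t in times):
        raise ValueError("average time is undefined: some formulas were not solved")
    return sum(times) / len(times)

def fraction_within_deadline(times, deadline):
    """Probabilistic-time measure: fraction of formulas solved by the deadline."""
    solved = sum(1 for t in times if t is not None and t <= deadline)
    return solved / len(times)

def hybrid_average(times, keep_percent=90):
    """Average time over the easiest `keep_percent` percent of the inputs.

    Unsolved inputs (None) are treated as the hardest ones.
    """
    ordered = sorted(times, key=lambda t: float("inf") if t is None else t)
    kept = ordered[: len(ordered) * keep_percent // 100]
    return average_time(kept)
```

With one unsolved formula among ten, `average_time` raises an error while the other two measures remain well defined, which mirrors the distinction drawn in the text.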

The  literature  contains  a  number  of  studies  of  the  average  time  and  probabilistic 
time  performance  of  certain  SAT  algorithms  [53,  168,  414,  167,  201,  202,  203, 
268,  421,  429,  424].  Despite  the  worst-case  complexity  of  SAT,  algorithms  and 
heuristics  with  polynomial  average  time  complexities  have  been  reported  [82,  83, 
165,  422,  426,  427,  428,  549].  This  subject  is  treated  in  more  detail  in  Section  12. 

3.  Number  of  Solutions. 

Some  researchers  investigated  the  number  of  solutions  of  random  SAT  formulas. 
Extending Iwama’s work [280], Dubois gave a combinatorial formula computing the
number  of  solutions  of  any  set  of  clauses  [148].  Dubois  and  Carlier  also  studied 
the  mathematical  expectation  of  the  number  of  solutions  for  a  probabilistic  model 
[149]. 

During the past two decades, many performance studies were carried out through
sampling techniques [311, 421, 498],* experimental simulations [54, 191, 240],
analytical studies [53, 176, 177, 429, 422, 426, 497], as well as the combined
effort of the above approaches [191, 265, 311, 421, 497].

* Knuth [311] first showed how to measure the size of a backtrack tree by repeatedly following
random paths from the root. Purdom [421] gave a modified version of Knuth’s algorithm which
greatly increases the efficiency of Knuth’s method by occasionally following more than one path
from a node. Stone and Stone [498] presented a variant of the algorithms of Knuth and Purdom
for estimating the size of the unvisited portion from the statistics of the visited portion.
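
The random-path sampling idea attributed to Knuth [311] can be sketched briefly. This is a minimal version written for illustration, with hypothetical names: `children` is any function returning the list of child nodes, and each probe returns an unbiased estimate of the number of nodes in the tree.

```python
import random

def estimate_tree_size(root, children, rng):
    """One probe: follow a random root-to-leaf path and weight each level.

    At a node with d children the weight is multiplied by d, so the sum of
    the weights along the path has expected value equal to the tree size.
    """
    estimate, weight, node = 0, 1, root
    while True:
        estimate += weight
        kids = children(node)
        if not kids:
            return estimate
        weight *= len(kids)
        node = rng.choice(kids)

def mean_estimate(root, children, probes, seed=0):
    """Average several independent probes to reduce the variance."""
    rng = random.Random(seed)
    return sum(estimate_tree_size(root, children, rng) for _ in range(probes)) / probes
```

On a perfectly balanced tree every probe is exact; on the irregular trees produced by backtracking the individual probes vary widely, which is what motivates the refinements mentioned in the footnote.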

5.  SAT  Input  Models 

In  this  section,  we  describe  several  basic  SAT  input  models  and  their  charac¬ 
teristics. 


5.1.  Random  Input  Models.  The  running  time  of  a  SAT  algorithm  depends 
on  the  type  of  input  being  solved.  The  following  SAT  input  models  are  often  used 
to  generate  a  variety  of  input  types. 

•  Hardest  formulas.  Generate  that  formula  that  is  the  most  difficult  for  the 
algorithm  being  measured.  This  approach  is  often  used  for  analytical  studies. 
There  are  too  many  possible  formulas  to  use  this  approach  in  experimental 
studies.  Bugrara  and  Brown  [65]  reported  the  effects  these  minor  variations 
have  on  the  average  time  needed  by  the  simple  backtracking  algorithm. 

Experimental  studies  sometimes  include  results  for  the  hardest  formulas 
from  the  set  of  formulas  tested,  but  such  results  are  quite  different  from  what 
the  results  would  be  if  the  entire  set  of  possible  formulas  had  been  tested. 

Most analytical studies use the following two basic models to generate random
CNF formulas. Each model has several variations depending on whether identical
clauses are permitted, whether a variable and its negation can occur in a clause,
etc.

• The l-SAT model. In the l-SAT model, a randomly generated CNF formula
consists of m independently generated random clauses. Each clause
is chosen uniformly from the set of all possible clauses of exactly l literals
that can be composed from a variable set $X = \{x_1, \ldots, x_n\}$ such that no
two literals are equal or complementary. The number of possible clauses is
$2^l \binom{n}{l}$. This model is sometimes called the fixed-clause-length model. Similar
models were used in [53, 82, 83, 168, 414, 212, 217, 380, 373, 426].

• The average l-SAT model. In the average l-SAT model, a randomly
generated CNF formula consists of m independently generated random clauses.
In each clause, each of n variables occurs positively with probability $p(1 - p)$,
negatively with probability $p(1 - p)$, both positively and negatively with
probability $p^2$, not at all with probability $(1 - p)^2$, where p can be a function
of m and n. The average number of literals in a clause is $l = 2pn$. This model
is also called the random-clause-length model. This model and variations
were used in [165, 166, 203, 211, 212, 217, 258, 259, 301, 299, 476].
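
Both random models can be sketched as small generators. These are illustrative only: the function names and the signed-integer clause encoding are assumptions, and the variations mentioned above (for example, rejecting duplicate clauses) are not handled.

```python
import random

def random_l_sat(n, m, l, rng):
    """Fixed-clause-length model: l distinct variables per clause, random signs."""
    formula = []
    for _ in range(m):
        variables = rng.sample(range(1, n + 1), l)  # no repeated variables
        formula.append([v if rng.random() < 0.5 else -v for v in variables])
    return formula

def random_average_l_sat(n, m, p, rng):
    """Random-clause-length model: each of the 2n literals is included
    independently with probability p, so a variable appears positively only
    with probability p(1 - p), both ways with p^2, and so on.
    Empty and tautological clauses can occur, as in the model itself."""
    formula = []
    for _ in range(m):
        clause = []
        for v in range(1, n + 1):
            if rng.random() < p:
                clause.append(v)
            if rng.random() < p:
                clause.append(-v)
        formula.append(clause)
    return formula
```

With n = 50 and p = 0.03 the second generator produces clauses averaging 2pn = 3 literals, the "average 3-SAT" setting used in the figures.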

Most  papers  use  just  one  model,  but  the  performance  of  simple  backtracking 
has  been  considered  under  a  number  of  related  models  [65]. 

5.2. Hardness. Various SAT algorithms differ greatly in the amount of time
they need to solve particular inputs. For example, Iwama’s algorithm [280] is fast
for random formulas with lots of solutions and slow for random formulas with few
solutions, while simple backtracking [428] is fast on formulas with few solutions and
slow on formulas with many solutions. Therefore, the hard-and-easy distributions
of SAT formulas depend not only on the inherent property of the SAT input models
but also on the algorithms used to solve the formulas. Any particular SAT formula
is easy for some algorithm (for example a table lookup algorithm with that formula
in its lookup table). Thus, hardness is a property of large sets of formulas rather
than individual formulas.

Figure 9. Percentage of satisfiability for formulas with 50 variables
generated by the 3-SAT and the average 3-SAT input models,
respectively. The horizontal axis is measured by m or m/n.

For  sets  of  formulas  generated  by  random  models  with  parameters,  the  proba¬ 
bility  of  finding  a  solution  varies  with  the  parameter  settings.  Those  sets  generated 
with  parameters  set  in  regions  where  solutions  are  going  from  unlikely  to  common 
are  particularly  difficult  for  all  algorithms  that  have  been  studied  (see  Figure  9). 
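
The dependence of the probability of satisfiability on the parameter m/n can be observed with a small experiment. The sketch below is an assumption-laden illustration, not the setup behind Figure 9: it uses the fixed-clause-length model with 3 literals per clause, a plain splitting search as the complete test, and far fewer variables than the 50 used in the figure.

```python
import random

def satisfiable(clauses):
    """Complete test by splitting on the first literal of the first clause."""
    if not clauses:
        return True
    if any(not c for c in clauses):
        return False
    lit = clauses[0][0]
    for choice in (lit, -lit):
        reduced = [[x for x in c if x != -choice] for c in clauses if choice not in c]
        if satisfiable(reduced):
            return True
    return False

def fraction_satisfiable(n, ratio, samples, rng):
    """Estimate the probability that a random 3-SAT formula with m = ratio * n
    clauses over n variables is satisfiable."""
    m = round(ratio * n)
    hits = 0
    for _ in range(samples):
        formula = [
            [v if rng.random() < 0.5 else -v for v in rng.sample(range(1, n + 1), 3)]
            for _ in range(m)
        ]
        hits += satisfiable(formula)
    return hits / samples
```

Sweeping `ratio` from small to large values traces out a curve that falls from near 1 to near 0, with the difficult instances concentrated where the fraction is changing.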

For random l-SAT formulas, fewer literals per clause and a larger number of clauses
reduce the possibility of making all clauses jointly satisfiable. Therefore the computing
time for random l-SAT formulas increases, up to a point, when m/n increases or
the number of literals l (l ≥ 3) in each clause decreases (Figure 8). Inspection of
Figure 8 reveals a “hump” of difficulty for l-SAT formulas where 50% of the sample
space is satisfiable, but a “flat” increase in difficulty for random average l-SAT
formulas in a correspondingly similar region of the parameter space.

5.3. Comparison of Random Input Models. The structural properties
of random formulas generated by the two input models given above can be quite
different and this can have a significant impact on the performance of a complete
SAT algorithm. This significance is felt especially in the region of the parameter
space for which random formulas are nearly equally likely to be satisfiable or
unsatisfiable. Figure 8 shows, for 50 variables, the actual computing time of a complete
SAT (SAT14.11) algorithm* for random formulas generated from the 3-SAT model
and the average 3-SAT model [207, 212, 216, 217]. Figure 9 shows the percent
of random formulas that are satisfiable as a function of formula size for both
models. For a complete algorithm, the problem instances generated from the average
3-SAT model are much easier than those generated from the 3-SAT model. It takes
a complete algorithm much less computing time to solve formulas generated from
the average 3-SAT model.

* SAT14.11 is a backtracking algorithm combined with coordinate descent in the real
space [217].

Figure 10. Percentage of satisfiability for two average 3-SAT
problem models with $|C|_{min} = 1$ and $|C|_{min} = 2$ (50 variables),
respectively. Problem instances generated with smaller length of
the shortest clauses have much lower percentage of satisfiability.

For an incomplete algorithm such as local search, however, the situation is
different. In Figure 9, for the same number of clauses (i.e., the same m/n values),
problem instances generated from the average 3-SAT model have a much lower
percentage of satisfiability (compared to those generated from the 3-SAT model). The
sat-and-unsat boundary of the average 3-SAT model is shifted to the left and
occurs at smaller m/n values than those for the 3-SAT problem model. For the
same m/n values, more problem instances generated from the average 3-SAT model
are unsatisfiable, making it harder for a local search algorithm to handle the
average 3-SAT problem model. Experimental results confirmed that it took a local
search algorithm (SAT1, for example) a much longer time to solve problem instances
generated from the average 3-SAT model [211].

Many  factors  can  affect  the  property  of  the  random  models  significantly.  For 
the  same  average  3-SAT  problem  model  even  a  slight  variation  to  the  length  of 
the  shortest  clause  in  a  CNF  formula  would  significantly  shift  the  sat-and-unsat 
boundary.  In  Figure  10,  the  solid  curve  was  generated  from  an  average  3-SAT 
model. The length of the shortest clause in the model was 1. The dotted curve was
generated  from  the  same  average  3-SAT  model  but  the  length  of  the  shortest  clause 
was  set  to  2.  Clearly,  shorter  clauses  enforce  tighter  constraints  and  generate  much 
harder  random  instances  for  the  same  model. 

Incomplete algorithms that fail on unsatisfiable inputs can be effective only in
the half-planes $m/n < 2^l/l$ for the l-SAT model and $pn > \ln(m)$ for the average
l-SAT model, where the probability that a random formula is satisfiable is high (see
Section 12). Incomplete algorithms that fail on satisfiable inputs can be effective
only in regions complementary to those above.

Experience with the best complete algorithms has caused some to conclude the
following:

1. Average l-SAT formulas are easy for the best algorithms;

2. l-SAT formulas are difficult even for the best algorithms; and

3. Formulas generated by both models are of similar difficulty when the average
clause length is large.

Obviously,  there  are  some  conflicts  in  these  beliefs. 

5.4.  Practical  Input  Models.  Random  input  models  such  as  those  discussed 
above  are  suitable  for  analytical  studies  of  SAT  algorithms  because  they  generate 
formulas  which  possess  a  symmetry  that  can  be  exploited  for  analysis.  Actual 
formulas  often  have  a  different  structure.  Therefore,  structured  problem  instances 
and  practical  SAT  applications  are  essential  to  evaluate  the  performance  of  SAT 
algorithms.  Examples  of  these  are  the  following: 

•  Regular  SAT  models.  Models  derived  from  problems  such  as  graph  col¬ 
oring  and  n-queens,  are  used  to  assess  the  performance  of  SAT  algorithms 
[210,  217]. 

• Practical applications problems. Models derived from practical
application domains, such as integrated circuit design, mobile communications,
computer architecture and network design, computer-aided manufacturing,
and real-time scheduling, have a variety of special characteristics (see Section
14).

Some  experiments  strongly  suggest  that  there  is  little  correlation  between  the 
performance  of  a  SAT  algorithm  tested  through  random  input  models  and  the  per¬ 
formance  of  the  same  algorithm  tested  through  practical  input  models.  Local  search 
is  faster  for  some  random  inputs  but  can  be  slower  than  a  complete  SAT  algorithm 
for problems arising from practical applications. The boundary phenomenon
discussed in random models is an artifact of some probabilistic models. It has not yet
been observed in practical input models.

Practical  applications  are  ultimately  the  most  important,  although  it  is  difficult 
for  people  outside  the  area  of  application  to  understand  how  important  or  difficult 
a  particular  application  problem  is.  It  is  also  difficult  to  develop  a  general  theory 
on  the  speed  of  SAT  algorithms  on  applications.  Much  research  is,  therefore,  done 
on  the  more  regular  source  of  problems  in  the  hope  of  better  understanding  the 
speed  that  SAT  algorithms  will  have  when  applied  to  a  wide  range  of  practical 
applications. 


6.  Splitting  and  Resolution 

Recursive  replacement  of  a  formula  by  one  or  more  other  formulas,  the  solution 
of  which  implies  the  solution  of  the  original  formula,  is  an  effective  paradigm  for 
solving CNF formulas. Recursion continues until one or more primitive formulas
have  been  generated  and  solved  to  determine  the  satisfiability  of  the  original.  One 
way  to  achieve  this  is  through  splitting. 

In splitting, a variable v is selected from a formula, and the formula is replaced
by one sub-formula for each of the two possible truth assignments to v. Each
sub-formula has all the clauses of the original except those satisfied by the
assignment to v, and otherwise all the literals of the original formula except
those falsified by the assignment. Neither sub-formula contains v, and the
original formula has a satisfying truth assignment if and only if either
sub-formula has one. Splitting ensures that a search for a solution terminates
with a result.
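
As an illustration, the splitting step can be sketched in a few lines of Python.
The representation is an assumption made only for this sketch: a clause is a
frozenset of nonzero integers, where k stands for variable k and -k for its
negation. The name `split` is hypothetical.

```python
def split(clauses, var):
    """Return the sub-formulas for var = True and var = False."""
    def assign(literal):
        reduced = []
        for clause in clauses:
            if literal in clause:
                continue  # clause is satisfied by the assignment: drop it
            reduced.append(clause - {-literal})  # remove the falsified literal
        return reduced
    return assign(var), assign(-var)


formula = [frozenset({1, 2}), frozenset({-1, 3}), frozenset({2, -3})]
when_true, when_false = split(formula, 1)
```

Neither returned sub-formula mentions variable 1, and the original formula is
satisfiable exactly when at least one of the two is.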

Another effective paradigm is based on resolution [446]. In resolution, a
variable v is selected and a resolvent (see below) obtained using v is added to the
original formula. The process is repeated to exhaustion or until an empty clause is
generated. The original formula is unsatisfiable if and only if an empty clause is a
resolvent. Although there is only one new formula on each step, the total number
of steps (or resolvents) can be extremely large compared to the number of clauses
in the original formula. Many algorithms that use resolution form all possible
resolvents using a particular variable at one time. When this is done, the original
clauses that contain the variable and its negation may be dropped. An algorithm
may use both splitting and resolution.

Early examples of these approaches are the two forms of the Davis-Putnam
procedure. The original DP procedure used resolution [118] while the revised
version, i.e., the Davis-Putnam-Loveland (DPL) procedure, used splitting [117,
354]. Combining splitting with depth-first search in the DPL procedure avoids
the memory explosion that occurs on many inputs when they are solved by the
original DP procedure.

Most  recursive  SAT  algorithms  use  the  following  primitive  conditions  to  stop 
the  recursion: 

1.  formulas  with  an  empty  clause  have  no  solution. 

2.  formulas  with  no  clauses  have  a  solution. 

3.  formulas with no variables (i.e., all variables have been assigned values) are
trivial.

The  following  subsections  present  various  SAT  algorithms,  organized  by  the 
basic  approach  that  each  algorithm  takes.  Some  of  these  algorithms  are  much 
simpler  than  you  would  want  to  use  in  practice  but  are  of  interest  because  it  has 
been possible to analyze their running time for random formulas.

6.1.  Resolution.  Given two clauses $C_1 = (v \vee x_1 \vee \ldots \vee x_{l_1})$ and
$C_2 = (\bar{v} \vee y_1 \vee \ldots \vee y_{l_2})$, where all $x_i$ and $y_j$ are distinct, the
resolvent of $C_1$ and $C_2$ is the clause
$(x_1 \vee \ldots \vee x_{l_1} \vee y_1 \vee \ldots \vee y_{l_2})$, that is, the disjunction
of $C_1$ and $C_2$ without $v$ or $\bar{v}$. The resolvent is a logical consequence of the
logical and of the pair of clauses.
Resolution  is  the  process  of  repeatedly  generating  resolvents  from  original  clauses 
and  previously  generated  resolvents  until  either  the  null  clause  is  derived  or  until  no 
more  resolvents  can  be  created  [446].  In  the  former  case  (a  refutation)  the  formula 
is  unsatisfiable  and  in  the  latter  case  it  is  satisfiable. 
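
The definition above can be turned into a naive saturation procedure. The
following Python sketch is only illustrative and makes several assumptions: clauses
are frozensets of nonzero integers (k for a variable, -k for its negation), the
helper names are hypothetical, and no attempt is made to control the number of
resolvents, which can grow exponentially.

```python
from itertools import combinations


def resolvents(c1, c2):
    """Yield every resolvent of c1 and c2 (one per complementary literal pair)."""
    for lit in c1:
        if -lit in c2:
            yield (c1 - {lit}) | (c2 - {-lit})


def is_unsatisfiable(clauses):
    """Saturate under resolution; True iff the null clause is derived."""
    known = {frozenset(c) for c in clauses}
    while True:
        new = set()
        for c1, c2 in combinations(known, 2):
            for r in resolvents(c1, c2):
                if not r:
                    return True  # null clause: a refutation exists
                new.add(r)
        if new <= known:
            return False  # no more resolvents can be created
        known |= new
```

For instance, the four clauses over two variables that exclude every assignment
are refuted in two rounds, while a formula such as (x1 or x2) and (not x1 or x3)
reaches a fixed point without producing the null clause.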

For some formulas the order in which clauses are resolved can have a big effect
on how much effort is needed to solve them. The worst case associated with the best
possible order (the order is selected after the formula is given) has received
considerable study [181, 511, 231, 519]. These studies all used formulas that have
no solution, but where this is not obvious to the resolution algorithm. Eventually
a much stronger result was shown: nearly all random $l$-SAT formulas need
exponential time when the ratio of clauses to variables is above a constant (whose
value depends on $l$) [86]. The constant is such that nearly all of the formulas in
this set have no solution.

A  number  of  restrictions  and  at  least  one  extension  to  resolution  have  been 
proposed  and  applied  to  CNF  formulas.  Restrictions  aim  to  shorten  the  amount  of 
time needed to compute a resolution derivation by limiting the number of possible
resolvents to choose from at each resolution step. The extension aims to provide
shorter  derivations  than  possible  for  resolution  alone  by  adding  equivalences  which 
offer  more  clauses  on  which  to  resolve.  A  nice  treatment  of  these  refinements  can 
be  found  in  [63],  Chapter  4.  We  mention  here  a  few  of  these. 

Set of Support [548]. Split a given formula into two sets of clauses $\mathcal{F}_1$ and
$\mathcal{F}_s$ such that $\mathcal{F}_1$ is satisfiable. Permit only resolutions involving one clause
either in $\mathcal{F}_s$ or an appropriate previous resolvent. Set $\mathcal{F}_s$ is called the support
set. This restriction can be useful if a large portion of the given formula is
easily determined to be satisfiable.

P-  and  N-Resolution.  If  one  of  the  two  clauses  being  resolved  has  all  positive 
literals  (resp.  negative  literals),  then  the  resolution  step  can  be  called  a  P-resolution 
(resp.  N-resolution)  step.  In  P-resolution  (resp.  N-resolution)  only  P-resolution 
(resp.  N-resolution)  steps  are  used.  Clearly  there  is  great  potential  gain  in  this 
restriction  due  to  the  usually  low  number  of  possible  resolvents  to  consider  at  each 
step.  However,  it  has  been  shown  that  some  formulas  solved  in  polynomial  time 
with  general  resolution  require  exponential  time  with  N-resolution. 

Linear  Resolution.  We  have  linear  resolution  if  every  resolution  step  except 
the  first  involves  the  most  recently  generated  resolvent  (the  other  clause  can  be  a 
previous  resolvent  or  a  clause  in  the  given  formula).  Depending  on  the  choice  of 
initial  clause  and  previous  resolvents  it  is  possible  not  to  complete  a  refutation. 

Regular Resolution [511]. In every path of a resolution tree no variable is
eliminated  more  than  once. 

Davis-Putnam Resolution. Once all the resolvents with respect to a particular
variable have been formed, the clauses of the original formula containing that
variable  can  be  dropped.  Doing  this  does  not  change  the  satisfiability  of  the  given 
formula,  but  it  does  change  the  set  of  solutions  to  the  extent  that  the  value  of  that 
variable  is  no  longer  relevant.  When  dropping  clauses,  it  is  natural  to  first  form  all 
the  resolvents  for  one  variable,  then  all  the  resolvents  for  a  second  variable,  and  so 
on.  When  doing  resolution  in  this  way,  it  is  easy  to  find  one  satisfying  assignment 
if the formula is satisfiable. At the next to last step the formula has just one
variable, so each value can be tested to see which one satisfies the formula (perhaps
both  will).  Pick  a  satisfying  value  and  plug  it  into  the  formula  for  the  next  step, 
converting  it  into  a  one  variable  formula.  Solve  that  formula  and  proceed  in  this 
manner  until  an  assignment  for  all  variables  is  found. 
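
A minimal sketch of this variable-by-variable elimination is given below. It only
decides satisfiability and does not perform the back-substitution that recovers an
assignment. The clause representation (frozensets of nonzero integers), the
elimination order (increasing variable index), the discarding of tautological
resolvents, and all function names are assumptions of the sketch.

```python
def is_tautology(clause):
    """A clause containing both a literal and its negation is always true."""
    return any(-lit in clause for lit in clause)


def eliminate(clauses, var):
    """Form all resolvents on var, then drop every clause that mentions var."""
    pos = [c for c in clauses if var in c]
    neg = [c for c in clauses if -var in c]
    kept = {c for c in clauses if var not in c and -var not in c}
    for p in pos:
        for n in neg:
            r = (p - {var}) | (n - {-var})
            if not is_tautology(r):
                kept.add(r)
    return kept


def dp_satisfiable(clauses):
    """Eliminate variables one at a time; unsatisfiable iff a null clause appears."""
    current = {frozenset(c) for c in clauses if not is_tautology(c)}
    for var in sorted({abs(lit) for c in current for lit in c}):
        if frozenset() in current:
            return False
        current = eliminate(current, var)
    return frozenset() not in current
```

Because the clauses containing the eliminated variable are dropped, the set held
in memory at any time mentions only the variables that remain, although it can
still grow very large.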

Extended Resolution [511]. For any pair of variables $a$, $b$ in a given formula
$\mathcal{F}$, create a variable $z$ not in $\mathcal{F}$ and append the following expression to $\mathcal{F}$:
$(z \vee a) \wedge (z \vee b) \wedge (\bar{z} \vee \bar{a} \vee \bar{b})$. Judicious use of such extensions
can result in polynomial size refutations for problems that have no polynomial
size refutations without extension.

The following strategies help reduce the time to compute a resolution derivation.

Subsumption. If the literals in one clause are a subset of those in another
clause,  then  the  smaller  clause  is  said  to  subsume  the  larger  one.  Any  assignment 
of  values  to  variables  that  satisfies  the  smaller  clause  also  satisfies  the  larger  one,  so 
the  larger  one  can  be  dropped  without  changing  the  set  of  solutions.  Subsumption 
is  of  particular  importance  in  resolution  algorithms  because  resolution  tends  to 
produce  large  clauses. 
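
Subsumption can be sketched as a simple filter. In the following Python fragment
the clause representation (frozensets of nonzero integers) and the function name
are assumptions; the quadratic scan is chosen for clarity rather than speed.

```python
def remove_subsumed(clauses):
    """Drop every clause that is a superset of some other clause."""
    # Visiting shorter clauses first guarantees that any subsuming clause
    # has already been kept by the time a longer clause is examined.
    unique = sorted({frozenset(c) for c in clauses}, key=len)
    kept = []
    for clause in unique:
        if not any(small <= clause for small in kept):
            kept.append(clause)
    return kept
```

The set of satisfying assignments is unchanged, since every dropped clause is
implied by a kept one.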

Pure  Literals.  A  literal  is  pure  if  all  its  occurrences  are  either  all  positive 
or  all  negative.  No  resolvents  can  be  generated  by  resolving  on  a  pure  literal,  but 
all  clauses  containing  a  pure  literal  can  be  removed  without  loss.  An  important 
improvement  to  the  basic  resolution  algorithm  is  to  first  remove  clauses  containing 
pure  literals  (before  resolving  on  non-pure  literals)  [118]. 
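
The pure literal preprocessing step can be sketched as follows. The clause
representation (frozensets of nonzero integers) and the function name are
assumptions. The loop repeats because removing clauses may make further literals
pure.

```python
def remove_pure_literal_clauses(clauses):
    """Repeatedly delete all clauses that contain a pure literal."""
    clauses = [frozenset(c) for c in clauses]
    while True:
        literals = {lit for c in clauses for lit in c}
        pure = {lit for lit in literals if -lit not in literals}
        if not pure:
            return clauses
        clauses = [c for c in clauses if not (c & pure)]
```

The reduced formula is satisfiable exactly when the original is, because setting
each pure literal to true satisfies every clause that was removed.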

Although  resolution  can  be  applied  to  SAT,  the  main  reason  for  interest  in 
resolution  is  that  it  can  be  applied  to  the  more  difficult  problem  of  solving  sentences 
of  first  order  predicate  logic.  There  is  a  vast  literature  on  that  subject.  Bibel  has 
a  good  book  on  the  topic  [32]. 


6.2.  Backtracking.  Backtracking algorithms are based on splitting. During
each iteration, the procedure selects a variable and generates two sub-formulas
by assigning the two values, true and false, to the selected variable. In each
sub-formula, those clauses containing the literal which is true for the variable
assignment are erased from the formula, and those clauses which contain the
literal which is false have that literal removed. Backtrack algorithms differ in
the way they select which variable to set at each iteration. The unit clause rule,
the pure literal rule, and the smallest clause rule are the three most common
ones. We state each algorithm informally.

The  flow  of  control  in  splitting-based  algorithms  is  often  represented  by  a  search 
tree.  The  root  of  the  tree  corresponds  to  the  initial  formula.  The  internal  nodes 
of  the  tree  correspond  to  sub-formulas  that  cannot  be  solved  directly,  whereas  the 
leaf  nodes  correspond  to  sub-formulas  that  can  be  solved  directly.  The  nodes  are 
connected  with  arcs  that  can  be  labeled  with  variable  assignments. 

Simple  Backtracking  [53].  If  the  formula  has  an  empty  clause  (a  clause 
which  always  has  value  false)  then  exit  and  report  that  the  formula  has  no  solution. 
If  the  formula  has  no  variables,  then  exit  and  report  that  the  formula  has  a  solution. 
(The  current  assignment  of  values  to  variables  is  a  solution  to  the  original  formula.) 
Otherwise,  select  the  first  variable  that  does  not  yet  have  a  value.  Generate  two 
sub-formulas by assigning each possible value to the selected variable. Solve the
sub-formulas recursively. Report a solution if any sub-formula has a solution, otherwise
report  no  solution. 
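
The informal description of simple backtracking corresponds to a short recursive
procedure. The Python sketch below assumes clauses are sets or frozensets of
nonzero integers (k for a variable, -k for its negation) and that the variable
order is supplied as a list; the function names are hypothetical. It stops at the
first solution rather than enumerating all of them.

```python
def assign(clauses, literal):
    """Sub-formula after making literal true."""
    return [c - {-literal} for c in clauses if literal not in c]


def simple_backtrack(clauses, variables, assignment=None):
    """Return a satisfying assignment as a dict, or None if there is none."""
    assignment = dict(assignment or {})
    if any(len(c) == 0 for c in clauses):
        return None  # an empty clause: no solution on this branch
    if not variables:
        return assignment  # no variables left: current setting is a solution
    var, rest = variables[0], variables[1:]
    for value in (True, False):
        literal = var if value else -var
        result = simple_backtrack(
            assign(clauses, literal), rest, {**assignment, var: value}
        )
        if result is not None:
            return result
    return None
```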

Unit  Clause  Backtracking  [422].  This  algorithm  is  the  same  as  simple 
backtracking  except  for  how  variables  are  selected.  If  some  clause  contains  only  one 
of  the  unset  variables  then  select  that  variable  and  assign  it  a  value  that  satisfies 
the  clause  containing  it;  otherwise,  select  the  first  unset  variable. 
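
The change relative to simple backtracking is confined to the selection step, as
the following sketch suggests. It assumes clauses are sets or frozensets of nonzero
integers from which falsified literals have already been removed, so that a clause
containing only one unset variable is simply a clause of length one. All names are
hypothetical.

```python
def assign(clauses, literal):
    """Sub-formula after making literal true."""
    return [c - {-literal} for c in clauses if literal not in c]


def choose_literal(clauses, unset):
    """Prefer the literal of a unit clause; else the first unset variable."""
    for clause in clauses:
        if len(clause) == 1:
            return next(iter(clause))  # variable and value are both forced
    return unset[0]


def unit_clause_backtrack(clauses, unset, assignment=None):
    """Return a satisfying assignment as a dict, or None if there is none."""
    assignment = dict(assignment or {})
    if any(not c for c in clauses):
        return None
    if not unset:
        return assignment
    first = choose_literal(clauses, unset)
    var = abs(first)
    rest = [v for v in unset if v != var]
    for literal in (first, -first):  # the satisfying value is tried first
        result = unit_clause_backtrack(
            assign(clauses, literal), rest, {**assignment, var: literal > 0}
        )
        if result is not None:
            return result
    return None
```

When the selected literal comes from a unit clause, the second branch produces an
empty clause immediately, so it costs a single step.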

In practice, this improved variable selection often results in much faster
backtracking [34].

Clause Order Backtracking [64]. This algorithm is the same as simple
backtracking except for how variables are selected. If the current setting does
not solve the formula, then select the first clause that can evaluate to both true
and false depending on the setting of the unset variables. Select variables from
this clause until its value is determined.

By  setting  only  those  variables  that  affect  the  value  of  clauses,  this  algorithm 
sometimes  avoids  the  need  to  assign  values  to  all  the  variables.  The  algorithm 
as  stated  finds  all  the  solutions,  but  in  a  compressed  form.  The  solutions  come  in 
cylinders,  where  some  variables  have  the  value  “don’t  care.”  Thus,  a  single  solution 
with  unset  variables  represents  the  set  of  solutions  obtained  by  making  each  possible 
assignment  to  the  unset  variables. 

Probe  Order  Backtracking  [430].  This  algorithm  is  the  same  as  simple 
backtracking  except  for  how  clauses  are  selected.  Temporarily  set  all  the  unset 
variables  to  some  predetermined  value.  Select  the  first  clause  that  evaluates  to  false 
with  this  setting.  Return  previously  unset  variables  back  to  unset  and  continue  as 
in  clause  order  backtracking. 

For  practical  formulas  one  should  consider  adding  the  following  five  refinements 
to  probe  order  backtracking:  stop  the  search  as  soon  as  one  solution  is  found, 
carefully  choose  the  probing  sequence  instead  of  just  setting  all  variables  to  a  fixed 
value  [346,  484,  488],  probe  with  several  sequences  at  one  time  [69,  70],  carefully 
select  which  variable  to  set  [69,  70],  use  resolution  when  it  does  not  increase  the 
input  size  [166].  The  sixth  best  prize  winning  entry  in  the  1992  SAT  competition 
used  an  improvement  on  probe  order  backtracking  [70]. 

Franco [165] noticed that a random assignment solves a nonzero fraction of
the formulas in the average $l$-SAT model when $pn$ is large compared to $\ln m$.
Simple uses of that idea do not lead to good average time [430], but combining the
idea with clause order backtracking leads to probe order backtracking, which is fast
when $pn$ is above $\ln m$. Probe order backtracking appears to have some similarities
to one method that humans use in problem solving in that it focuses the algorithm’s
attention onto aspects of the problem that are causing difficulty, i.e., setting
variables that are causing certain clauses to evaluate to false. For the same reason it
is somewhat similar to some of the incomplete searching algorithms discussed in
Section 7.

Shortest  Clause  Backtracking.  This  algorithm  is  the  same  as  clause  order 
backtracking  except  for  the  clause  selected.  In  this  case,  select  the  shortest  clause. 

The  corresponding  idea  for  constraint  satisfaction  is  to  first  set  a  variable  in 
the  most  constraining  relation.  This  idea  is  quite  important  in  practice  [34]. 

Jeroslow-Wang [283]. A backtrack search can sometimes be terminated early
by checking whether the remaining clauses can be solved by a Linear Programming
relaxation (see Sections 9.2 and 9.3). An implementation of this idea can be
expensive. Jeroslow and Wang have proposed a simpler and effective technique that is
similar in spirit. The idea is, before splitting, to apply a procedure that iteratively
chooses the variable and value which, in some sense, maximizes the chance of
satisfying the remaining clauses. The procedure does not backtrack and is, therefore,
reasonably fast. Assignments determined by the procedure are temporarily added
to the current partial truth assignment. If the procedure succeeds in eliminating
all clauses then the search is terminated and the given formula is satisfiable.
Otherwise, the procedure fails, control is passed to the split, temporary assignments
are undone, and backtracking resumes.

The choice of variable and value at each iteration maximizes the weight $w(S_{i,j})$
where, for a subset of clauses $S$, $w(S) = \sum_{C \in S} 2^{-|C|}$ and, for $1 \le j \le n$,
$S_{i,j}$ is the subset of remaining clauses containing variable $v_j$ as a positive literal if
$i = 0$ and as a negative literal if $i = 1$. The length of clause $C$, denoted $|C|$ above,
is the number of literals that are not falsified by the current partial assignment and
the sum is over clauses that are not satisfied by the current partial assignment. The
weight given above may be compared to that given by Johnson in [285] (see also
Other Non-Backtracking Heuristics below).
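
The weight computation can be sketched as follows. The sketch assumes that the
list passed in already holds only the clauses not yet satisfied, with falsified
literals removed, so that the length of each set is the clause length used in the
weight. Clauses are sets or frozensets of nonzero integers, a positive integer j
corresponding to the case i = 0 and -j to the case i = 1; the function name is
hypothetical.

```python
def jeroslow_wang_choice(clauses):
    """Return the literal with the largest weight, together with that weight."""
    weights = {}
    for clause in clauses:
        share = 2.0 ** -len(clause)  # contribution 2^(-|C|) of this clause
        for literal in clause:
            weights[literal] = weights.get(literal, 0.0) + share
    best = max(weights, key=weights.get)
    return best, weights[best]


remaining = [{1, 2}, {1, -2, 3}, {-1, 3}, {1, -3}]
literal, weight = jeroslow_wang_choice(remaining)
```

In this example the positive literal of variable 1 occurs in two clauses of
length two and one of length three, giving weight 1/4 + 1/8 + 1/4, which is the
largest among all literals. Short clauses dominate the weight, so the rule tends
to satisfy them first.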

6.3.  Backtracking  and  Resolution.  Some  algorithms  have  adapted  ideas 
inspired by resolution to splitting algorithms. For example, from the resolution
viewpoint, pure literals are interesting in that they lead to a single sub-formula
that  is  no  more  complex  than  the  original  formula,  while  from  the  perspective  of 
splitting,  pure  literals  lead  to  two  sub-formulas,  but  the  solutions  to  the  sub-formula 
where  the  literal  has  the  value  false  are  a  subset  of  the  one  where  the  literal  has  the 
value  true.  Therefore,  the  original  formula  has  a  solution  if  and  only  if  the  formula 
associated  with  the  true  literal  does. 

The Pure Literal Rule Algorithm [201]. Select the first variable that does
not have a value. (If all variables have values, then the current setting is a solution
if it satisfies all the clauses.) If some value of the selected variable results in all
clauses that depend on that variable having the value true, then generate one
sub-formula by assigning the selected variable the value that makes its literals true.
Otherwise, generate a sub-formula for both values of the selected variable. Solve
the one or two sub-formulas.

6.4.  Clause Area.  A clause with $l$ distinct literals leads to the fraction $1/2^l$
of the possible variable settings not being solutions. One can think of the clause as
blocking out area $1/2^l$ on the Venn diagram for the formula. Iwama showed that
combining this idea with inclusion-exclusion and careful programming leads to an
algorithm which runs in polynomial average time when $p > \sqrt{(\ln m)/n}$ [280]. If
the sum of the area of all clauses is less than 1, then some variable setting leads
to a solution. This idea works particularly well with shortest-clause backtracking
since that algorithm tends to eliminate short clauses. See [170] for a probabilistic
analysis of this idea. No average-time analysis has been done.
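
The sufficient condition based on total area is easy to state in code. The sketch
below assumes clauses are given as collections of nonzero integers and uses exact
rational arithmetic to avoid rounding at the boundary; the function names are
hypothetical. The test is one-sided: a total of 1 or more says nothing about
satisfiability, since the blocked areas may overlap.

```python
from fractions import Fraction


def total_blocked_area(clauses):
    """Sum of 1/2^l over all clauses, where l counts distinct literals."""
    return sum(Fraction(1, 2 ** len(set(c))) for c in clauses)


def area_guarantees_solution(clauses):
    """True when the clauses cannot block every variable setting."""
    return total_blocked_area(clauses) < 1
```

Three clauses of two literals each block at most 3/4 of the settings, so such a
formula always has a solution, whereas the two unit clauses (x1) and (not x1)
have total area exactly 1 and the test gives no guarantee.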

6.5.  Improved  Techniques  for  Backtracking.  This  section  considers  some 
refinements  that  can  be  added  to  the  basic  backtracking  and  resolution  techniques. 
Several  of  these  are  similar  to  techniques  that  have  already  been  discussed. 

Branch  Merging.  This  is  complementary  to  preclusion.  Backtracking  is 
frequently  used  on  problems  such  as  the  n-queens  problem  where  there  is  a  known 
symmetry  group  for  the  set  of  solutions.  In  such  cases  many  search  trees  possess 
equivalent  branches  which  can  be  merged  to  reduce  search  effort  [34,  544].  The  use 
of  the  symmetry  group  can  greatly  speed  up  finding  the  solutions.  See  [72,  73]  for 
examples from the field of group theory. Brown, Finkelstein, and Purdom [51, 52]
described additional problems that arise in making these techniques work
with a backtracking algorithm which needs to set variables in different orders on
different branches of the search tree.

Search Rearrangement. This is also known as most-constrained search or
nonlexicographic ordering search. When faced with several choices of extending a
partial solution, it is more efficient to choose the one that offers the fewest
alternatives [34]. That is, nodes with fewer successors should be generated early in the
search tree, and nodes with more successors should be considered later. The
vertical (variable) ordering and horizontal (value) ordering are special cases of search
rearrangement [54, 176, 422, 426, 429, 498]. The rule used to determine which
variable  to  select  next  is  often  called  the  branching  rule.  Many  researchers  are 
actively  investigating  the  selection  of  branching  variables  in  the  DP  procedures. 
Hooker  studied  the  branching  rule  and  its  effect  with  respect  to  particular  problem 
instances  [255].  Bohm  and  Speckenmeyer  experimented  with  branching  effect  with 
a  parallel  DP  procedure  implemented  on  an  MIMD  machine  [38].  Boros,  Hammer, 
and  Kogan  developed  branching  rules  that  aim  at  the  fastest  achievement  of  q-Horn 
structures  [41].  Several  particular  forms  of  search  rearrangement  were  discussed  in 
Section  6.2. 

From 2-SAT to General SAT. In many practical applications, many of the
constraints in a SAT problem formulation are coded as 2-SAT clauses.

An important heuristic in SAT problem solving is to first solve the 2-SAT clauses
with fast polynomial time algorithms. This fast operation can significantly reduce
the search space. The truth assignment to the rest of the variables can be handled
with a DP procedure. This idea has been used in the SAT solver Stamm [69, 70],
Gallo and Pretolani’s 2-SAT relaxation [69, 70, 419], and Larrabee’s algorithm
[326, 474]. Related ideas were also developed. Eisele’s SAT solver uses a weighted
number of occurrences, where occurrences in 2-SAT clauses count more than other
occurrences [69, 70]. Dörre further added a limited amount of forward checking
to quickly determine 2-SAT formulas in the Eisele-Dörre SAT solver [69, 70]. In
the SAT contest [69, 70] the winning programs with 2-SAT solvers were slightly
slower than those without.

Similar techniques were developed that use Horn-SAT relaxation in satisfiability
testing [108, 183]. In Crawford’s Tableau [108], Horn clauses are separated from
non-Horn clauses. Based on the DPL procedure, Tableau applies in priority the
unit clause rule and if necessary branches on a variable selected in the non-Horn
clauses using three successive heuristics.

Backmarking  and  Backjump.  When  a  failure  is  observed  or  detected,  the 
algorithm  simply  records  the  source  of  failure  and  jumps  back  to  the  source  of 
failure  while  skipping  many  irrelevant  levels  on  the  search  tree  [189,  191].  The 
more  effective  one’s  search  rearrangement  is,  the  less  need  there  is  for  backjumping. 
Good  search  orders  tend  to  be  associated  with  the  source  of  failure  being  one  level 
back.

Backtracking with Lookahead. A lookahead processor is a preprocessing
filter that prunes the search space by inconsistency checking [224, 225, 358, 387].
Backtracking with lookahead processing is performed by interleaving a depth-first
tree traversal with a lookahead tree pruning processor that deletes nodes on the
search tree whose value assignments are inconsistent with those of the partial
search path. Techniques in this class include partial lookahead, full lookahead
[224, 225, 240], forward checking [240, 240], network-based heuristics [358, 128],
and discrete relaxation [224, 225, 310, 448].

Backtracking for Proving Non-Existence. Recently, Dubois, Andre, Boufkhad,
and Carlier proposed a complete SAT algorithm, CSAT [151]. CSAT was
developed for the proof of the non-existence of a solution. The algorithm uses a simple
branching rule and local processing at the nodes of search trees (to detect further
search path consistency and make search decisions). It performed efficiently on some
DIMACS benchmarks.

Intelligent Backtracking. Backtracking is performed directly to the variable
that causes the failure, reducing the effect of thrashing behavior. Methods in this
category include dependency-directed backtracking [493, 145], revised
dependency-directed backtracking [411], simple intelligent backtracking [178], and a number of
simplifications [56, 119, 120, 121, 123, 125, 190, 240, 449].

Freeman [175] recently presented an intelligent backtracking algorithm, POSIT
(PrOpositional SatIsfiability Testbed). In this algorithm he used Mom’s heuristic,
detection of failed literals, and minimization of constant factors to speed up
backtracking search.

Some effort was devoted to the development of backtracking-oriented programming
languages, special-purpose computer architectures, and parallel processing
techniques:

Macro Expansion. In some applications of backtracking that require relatively
little storage, this method can be used to decrease the running time of the
program by increasing its storage requirements. The idea is to use macros in
assembly language in such a way that some work is done at assembly time instead of
many times at run time. This increases the speed at which nodes are processed in
the tree [34].

Backtrack Programming. Much work has focused on developing a new
programming language for backtracking search. This includes the sequential^ Prolog
programming language [94, 496], Prolog with intelligent backtracking scheme
[58, 321, 410], and logic programming [247].

Special-Purpose  Architectures.  Special-purpose  hardware  machines  were 
built  to  prune  search  space  [224,  225,  371],  perform  backtracking  search,  and  do 
AI  computations  [528,  529,  531]. 


^There is no backtracking mechanism in parallel Prolog programming languages.

Parallel Processing. Many parallel processing techniques have been developed
to speed up search computation [23, 99, 528, 341, 532, 342, 340, 249,
350, 514, 369, 458, 529, 547].


Branch and bound. Also known as ordered depth-first search. Select a
variable. For each possible value of the variable generate a sub-formula and compute
some quick-to-compute upper bound on the quality of the solution of the
sub-formula. Solve recursively all sub-formulas except those that have a cost above that
of the best solution that has been found so far. Branch and bound is recognized as
a generalization of many heuristic search procedures such as AO*, SSS*, B*,
alpha-beta, and dynamic programming algorithms [6, 341, 553, 536, 534, 533,
326, 331, 339, 407, 265, 553, 552, 546, 551].

6.6.  Some Remarks on Complexity.  The worst-case time for all known
SAT algorithms is exponential in the first power of the input size. The naive
algorithm that tries every variable setting requires time $2^n$ for $n$-variable formulas.
For 3-SAT, the best known bound on worst-case complexity has been worked down
from $1.618^n$ [382] to slightly below $1.5^n$, obtained by Schiermeyer [461, 462]. Other
work on the topic is given in [192].

As  with  other  NP-complete  problems  there  are  no  exponential  lower  bound 
results  for  SAT.  However,  it  has  been  proven  that  all  resolution  algorithms  need 
time  that  is  exponential  in  the  first  power  of  the  input  size  [231,  86,  519].  No 
such  lower  bound  analyses  have  been  done  on  splitting-based  algorithms. 

For  a  comprehensive  treatment  of  the  complexity  of  propositional  proofs,  see  a 
recent  survey  by  Urquhart  [522]. 


7.  Local  Search 

Local search is a major class of discrete, unconstrained optimization
procedures that can be applied to a discrete search space. Such procedures can be used
to solve SAT by introducing an objective function that counts the number of
unsatisfied (CNF) or satisfied (DNF) clauses and solving to minimize the value of
this function [207, 211, 212, 220, 400, 469].

In  this  section,  we  summarize  the  basic  framework,  including  a  search  space 
model,  four  essential  components,  and  present  ideas  used  in  the  early  development 
of  local  search  algorithms  for  the  SAT  problem.  We  then  describe  randomized 
local  search,  randomized  local  search  with  trap  handling,  and  greedy  local  search 
in  detail. 


7.1.  Framework.  Local search, or local optimization, is one of the primitive
forms of continuous optimization applied to a discrete search space. It was one
of the early techniques proposed to cope with the overwhelming computational
intractability of NP-hard combinatorial optimization problems. Local search can
be very efficient in favorable cases. However, when applied to SAT problems, local
search algorithms only work for formulas that have solutions, and even then there
is no guarantee that they will work.

Given a minimization (maximization) problem with objective function $f$ and
feasible region $R$, a typical local search procedure requires that, with each solution
point $x_k \in R$, there is a predefined neighborhood $N(x_k) \subset R$. Given a current
solution point $x_k \in R$, the set $N(x_k)$ is searched for a point $x_{k+1}$ with
$f(x_{k+1}) < f(x_k)$ ($f(x_{k+1}) > f(x_k)$). If such a point exists, it becomes the new
current solution point, and the process is iterated. Otherwise, $x_k$ is retained as a
local optimum with respect to $N(x_k)$. Then, a set of feasible solution points is
generated, and each of them is “locally” improved within its neighborhood. To apply
local search to a particular problem, one needs only to specify the neighborhood and
the procedure for obtaining a feasible starting solution.
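
The descent procedure described above can be sketched in a few lines of Python. This is only an illustration for the minimization case: the names `local_search`, `neighbors`, and `f`, and the toy integer problem, are assumptions made for the example and do not come from the cited algorithms.

```python
def local_search(start, neighbors, f):
    """Iterated descent: move to an improving neighbor until none exists."""
    x = start
    while True:
        # Search N(x) for a point with a strictly smaller objective value.
        better = next((y for y in neighbors(x) if f(y) < f(x)), None)
        if better is None:
            return x  # x is a local optimum with respect to N(x)
        x = better


# Toy instance (hypothetical): minimize f(x) = (x - 3)^2 over the integers
# 0..10, with neighborhood N(x) = {x - 1, x + 1} clipped to the feasible region.
def f(x):
    return (x - 3) ** 2


def neighbors(x):
    return [y for y in (x - 1, x + 1) if 0 <= y <= 10]


result = local_search(9, neighbors, f)
```

Only the neighborhood function and the starting point are problem specific, which is the point made in the paragraph above.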

Local search can be efficient for two reasons. First, at the beginning of the search,
a full assignment is given to all the variables in the search space, so search efforts
are focused on a single path in the search space. Second, local search looks for
improvement within its local neighborhood by testing for improvement and,
if there is any improvement, taking an action to realize it. Since the objective
function has a polynomial number of inputs, both testing and action can
be done efficiently. Little effort is needed to generate the next solution point. A
major weakness of local search is that the algorithm has a tendency to get stuck at
a locally optimum configuration, i.e., a local minimum.

Greedy local search pursues only paths where every step leads to an improvement,
but this leads to a procedure that becomes stuck much more often than
randomized local search. The greedy local search procedure gets stuck in flat
places as well as at local minima.

Many search techniques, such as statistical optimization [74, 464], simulated
annealing [306], stochastic evolution [454], and conflict minimization [206, 377,
482, 488], are either local search or variations of local search. For most search
problems encountered, in terms of computing time and memory space, local search
often achieves many orders of magnitude of performance improvement over
conventional techniques such as Branch-and-Bound [211, 212, 433, 484, 488].

7.2. A Three-Level Search Space Model. A large amount of real experimental
data suggests that there are several typical local minimum structures (see
Figure 11). A valley and a basin are ideal cases in which one can find a global
minimum quickly. Local search and the related heuristics can handle a terrace and
a plateau without much difficulty. The most difficult situation is a trap, where a
group of local minima is confined in a “well.” The search process walks around the
set of local minima periodically and cannot get away without a special mechanism.
In general, there may be many traps in a search problem. The characteristics of a
trap are closely related to the search algorithm, the objective function used, and
the search space structure.

Further observations suggest that a search space may be roughly divided into
several different levels, depending on the problem structures. A three-level search
space structure was proposed during the development of the SAT1.5 algorithm (see
Section 7.7) [211, 222]. An informal example of the model is given in Figure 12.
In the model, a search space is roughly viewed in three levels: top level, middle
level, and bottom level. The top level is the upper open portion of the search
space with smooth edges. Most optimization algorithms can descend quickly in
the top level and thus perform quite well. The middle level is the middle portion
of the search space where there are relatively “big mountain peaks.” During the
descent, the search process may encounter problems and it may have to use some
tunneling and random heuristics (see Section 7.3) to proceed. The bottom level
is the bottom portion of the valleys (particularly the lowest valley) where there are
many traps. When local search falls into a trap it may become locked into a loop
of local minima. Most algorithms do not succeed in this stage and have difficulty
continuing.

Figure 11. There are a number of local minimum structures. A
trap is a “well” of local minima and is difficult to deal with.

For  the  SAT  problem,  with  high  probability,  a  greedy  local  search  will  fall  into 
a  trap  much  more  easily.  In  this  case  some  variables  are  updated  very  quickly.  The 
related  clauses  oscillate  between  the  sat  and  unsat  states.  The  search  is  limited  to 
these  states.  Without  any  help,  there  is  little  chance  of  getting  out  to  explore  other 
states. 

The above observations suggest using multiphase search to handle NP-hard
problems [211, 214, 222]. That is, we may use an open search in the top level, a
peak search for searching “coarse” peak structures in the middle level, and a trap
search for tracking “fine” rugged trap surface structures in the valleys.

The  major  heuristics  used  in  local  search  are  discussed  in  the  next  subsection. 

7.3. Four Components in Local Search. A number of efficient local search
algorithms for the SAT problem have been developed since 1987. Previous
experience indicated that the greedy local search strategy alone cannot be adapted
to perform well on SAT formulas. Past lessons showed that the following four
components are crucial to the development of an efficient local search algorithm for
the satisfiability and NP-hard problems. They are: (1) the min-conflict heuristics,
(2) the best-neighbor heuristics, (3) the random value/variable selection heuristics,
and (4) the trap handling heuristics.

Figure 12. An informal example of the three-level search space
model. A search process would go through an open search in the
upper open portion of the search space, a peak search in the middle
portion of the search space, and a trap search in the valley portion
of the search space.


1. The Min-Conflict Heuristics.

Different forms of min-conflict heuristics were proposed during 1985 and 1987
for solving the SAT and CSP problems [206].^4 The min-conflict heuristics aim at
performing local conflict minimization in Boolean, discrete, and real spaces [451]:^5


Min-Conflict  Heuristic  (Boolean  Space)  [206].  Multiple  values  to  be 
assigned  to  a  variable  are  represented  by  a  vector  of  Boolean  labels.  Each  Boolean 
label,  either  “1”  or  “0,”  indicates  the  variable’s  instantiation  to  a  specific  value. 
Two  labels  are  conflicting  if  their  values  do  not  satisfy  the  given  constraint.  The 
conflicts  (due  to  an  assignment)  are  formulated  as  a  set  of  objective  functions.  The 
objective  functions  are  minimized  by  changing  values  assigned  to  the  labels. 

Min-Conflict  Heuristic  (Discrete  Space)  [206].  Interrelated  objects  are 
chosen  as  variables.  Two  variables  are  conflicting  if  their  values  do  not  satisfy  the 
given  constraint.  The  number  of  conflicts  (due  to  an  assignment)  is  formulated  in 
an  objective  function.  The  objective  function  is  iteratively  minimized  by  changing 
values  assigned  to  the  variables. 


^4 In the early days, min-conflict was variously called inconsistency removing, inconsistency
resolution, conflict resolution, enforce local consistency, and local conflict minimization [206].
Later, Minton shortened these words into a concise term: min-conflict.

^5 The min-conflict heuristics also work in real space (see examples in [209, 210, 217] and
Section 8).

Min-Conflict for SAT [207, 220, 211, 209, 212]. Using inconsistency as
objective [206], the objective function for the SAT problem gives the number of
unsatisfied clauses. A CNF is true if and only if the objective function takes the
global minimum value 0 on the corresponding solution point.

This objective function is the basis of the design of the SAT1, SAT2, SAT3,
and GSAT algorithms [211, 209, 220, 212, 469].
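
As a small illustration of this objective function, the sketch below counts the unsatisfied clauses of a CNF formula under a full assignment. The clause encoding (signed integers, with `-v` standing for the negation of variable `v`) and the name `unsat_count` are assumptions made for the example, not notation used by the cited algorithms.

```python
def unsat_count(clauses, assign):
    """Number of clauses with no true literal under `assign`.

    clauses: list of clauses, each a list of nonzero ints (v or -v).
    assign:  list of booleans indexed by variable; index 0 is unused.
    """
    def lit_true(lit):
        return assign[abs(lit)] if lit > 0 else not assign[abs(lit)]

    return sum(1 for clause in clauses if not any(lit_true(l) for l in clause))


# (x1 or not x2) and (x2 or x3) and (not x1 or not x3)
clauses = [[1, -2], [2, 3], [-1, -3]]
bad = [None, False, True, False]   # x1 = F, x2 = T, x3 = F
good = [None, True, True, False]   # x1 = T, x2 = T, x3 = F
```

The formula is satisfied exactly when the count reaches its global minimum of zero, as it does for `good` here.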

Performance. The min-conflict heuristics have been applied to solve the SAT
and CSP problems since 1985 [206, 207, 482, 481, 220, 211]. They showed
significant performance improvements when compared to traditional backtracking
search algorithms. The effectiveness of the min-conflict heuristics was further
observed by Russell and Norvig [451], Kumar [319, 320], Johnson [293], Minton et al.
[377], and Selman et al. [469].

2.  The  Best-Neighbor  Heuristics. 


Local  search  proceeds  by  taking  any  feasible  solution  point  that  reduces  the 
objective  function.  Among  many  neighboring  feasible  solution  points,  local  search 
does  not  take  into  account  its  neighbors’  relative  performance  with  respect  to  the 
objective  function. 

Best-Neighbor Heuristic [211, 212, 217]. A greedy algorithm selects the
best neighbor that yields the minimum value to the objective function and takes
this best neighbor direction as the descent direction of the objective function.

In a real search space, continuous optimization algorithms can find the best
neighbor feasible solution efficiently. A number of local and global optimization
algorithms have been developed to solve the SAT problem [211, 212, 217]. The
first version of the GSAT algorithm was proposed as a greedy local search algorithm
[469].
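
For SAT with single-variable flips as the neighborhood, the best-neighbor heuristic can be sketched as follows. This is a simplified illustration under assumed conventions (signed-integer literals, the hypothetical helper names `unsat_count` and `best_neighbor_flip`); it is not the published code of any of the algorithms cited above, and it recomputes the objective from scratch rather than incrementally.

```python
def unsat_count(clauses, assign):
    """Objective: number of clauses with no true literal."""
    return sum(
        1 for clause in clauses
        if not any(assign[abs(l)] == (l > 0) for l in clause)
    )


def best_neighbor_flip(clauses, assign):
    """Return (variable, objective value) of the best single flip.

    Every neighbor is evaluated; ties go to the lowest-numbered variable.
    `assign` is restored before returning.
    """
    best_var, best_val = None, None
    for v in range(1, len(assign)):
        assign[v] = not assign[v]            # tentatively flip v
        val = unsat_count(clauses, assign)
        assign[v] = not assign[v]            # undo the flip
        if best_val is None or val < best_val:
            best_var, best_val = v, val
    return best_var, best_val


clauses = [[1, -2], [2, 3], [-1, -3]]
assign = [None, False, True, False]          # one clause unsatisfied
move = best_neighbor_flip(clauses, assign)
```

Breaking ties deterministically, as done here for brevity, is exactly the kind of choice that the random heuristics described next are meant to replace.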


Performance. A greedy local search alone may become stuck at local minima
much more often and therefore may not be efficient in practice. Therefore, the
best neighbor heuristic should be used in conjunction with random value/variable
selection and trap handling heuristics described next.

3. The Random Value/Variable Heuristics.

Random value assignment and random variable selection techniques are
fundamental to the design of an effective local search algorithm for NP-hard problems
[226, 215].

Random Flip Heuristic [206, 211, 220, 209, 212]: Randomly flip the truth
values of $1 \le k \le n$ variables in the SAT formula.
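
A minimal sketch of the random flip heuristic is given below. The function name `random_flip` and the choice to flip `k` distinct variables are assumptions of the example; the source does not say how the `k` variables are drawn.

```python
import random


def random_flip(assign, k, rng=random):
    """Flip the truth values of k distinct, randomly chosen variables in place.

    assign: list of booleans indexed by variable; index 0 is unused.
    Returns the list of flipped variables.
    """
    n = len(assign) - 1
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    chosen = rng.sample(range(1, n + 1), k)
    for v in chosen:
        assign[v] = not assign[v]
    return chosen


assign = [None, False, False, False, False]
flipped = random_flip(assign, 2, random.Random(0))
```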

This simple heuristic was first implemented in several SAT1 algorithms as local
handler(s) in 1987. It has proved effective in improving the performance
of greedy local search algorithms [211, 220, 209, 212].

During 1988 to 1990, a similar heuristic, random swap, was used to develop
local search algorithms for the n-queen problems. It showed significant performance
improvement for solving large-size n-queen problems [206, 483, 484, 485, 488].

Random  Value  (Assignment)  Heuristics  [206,  211,  220,  212,  484,  485, 
488].  These  include:  randomly  select  a  value  that  generates  the  minimum  number 
of  conflicts;  randomly  select  a  value  if  there  is  a  symmetry  (i.e.,  more  than  one 
value  producing  the  same  performance);  and  randomly  select  a  value  for  conflict 
minimization  when  local  minima  are  encountered. 

Random  Variable  (Selection)  Heuristics  [206,  211,  220,  212].  There 
are  two  important  heuristics: 

1.  Any  Variable  Heuristic:  select  any  variable  randomly. 

2. Bad Variable Heuristic: select a variable from the set of conflicting
variables randomly.

The  random  variable  selection  heuristic  is  one  of  the  most  important  heuristics 
in  the  design  of  local  search  algorithms  for  NP-hard  problems.  It  was  first  used 
in  the  local  search  solution  for  the  SAT  problem  [207]  and  then  used  for  the  local 
search  solution  for  the  n-queen  problems  [484]. 

Conflicting  variables  in  the  SAT  problem  contribute  to  the  unsatisfied  clauses. 
Accordingly  we  have: 

Bad  Variable  Heuristic  for  the  SAT  problem  [211,  212,  484,  485, 
488,  483,  481]:  randomly  select  a  variable  in  the  unsatisfied  clauses  for  conflict 
minimization. 
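
The bad variable heuristic for SAT stated above can be sketched as follows. The helper names `bad_variables` and `pick_bad_variable` and the signed-integer clause encoding are assumptions of the example; drawing uniformly from the set of conflicting variables is one possible reading of "randomly select".

```python
import random


def bad_variables(clauses, assign):
    """Variables that occur in at least one unsatisfied clause."""
    bad = set()
    for clause in clauses:
        if not any(assign[abs(l)] == (l > 0) for l in clause):
            bad.update(abs(l) for l in clause)
    return sorted(bad)


def pick_bad_variable(clauses, assign, rng=random):
    """Randomly select a conflicting variable, or None if all clauses hold."""
    bad = bad_variables(clauses, assign)
    return rng.choice(bad) if bad else None


clauses = [[1, -2], [2, 3], [-1, -3]]
assign = [None, False, True, False]      # only (x1 or not x2) is unsatisfied
candidate = pick_bad_variable(clauses, assign, random.Random(1))
```

By contrast, the any variable heuristic would draw from all of `range(1, n + 1)` regardless of which clauses are unsatisfied.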

The bad variable heuristic was first implemented to solve the large size n-queen
problems during 1988 to 1990 [484, 485, 488, 483, 481] and was implemented in
the algorithm in 1990 [211, 212]. The bad variable heuristic was independently
developed by Papadimitriou for the 2-SAT problem in 1991 [399] and was
used in the WSAT algorithm by Selman et al. in 1994 [472].

Partial/Pre-Random Variable Selection Heuristics [206, 211, 220,
212]. Partial variable random selection makes use of partial or alternating variable
selection techniques. Variants of partial random selection include partial and
alternating selection of conflicting and non-conflicting variables, a combination of
partial deterministic and partial random variable selection, partial interleaved
selection of the different search phases, and partial random selection with
meta-heuristic control. The simplest selection strategies include:

• select a variable deterministically (randomly) and select another variable
randomly for conflict minimization;

• select a variable deterministically (randomly) from the set of conflicting
variables and select another variable randomly for conflict minimization;

• select a variable deterministically and select another variable randomly from
the set of conflicting variables for conflict minimization;

• during certain periods of search, select a variable deterministically (randomly)
and select another variable randomly for conflict minimization;

• during certain periods of search, select a variable deterministically (randomly)
from the set of conflicting variables and select another variable randomly for
conflict minimization;

• during certain periods of search, select a variable deterministically and select
another variable randomly from the set of conflicting variables for conflict
minimization.

Partial Random Variable Selection Heuristics for the SAT problem:
a variable may be selected from the unsatisfied clauses in a random, partially
alternating, partially periodic, or partially interleaving order.
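
One way to picture a combination of partial deterministic and partial random variable selection is the sketch below, which at each step either flips a random variable from a random unsatisfied clause or flips the best neighbor. The mixing probability `p`, the step budget `max_flips`, and the function names are assumptions made for illustration; this is not the SAT3 procedure or any other published algorithm.

```python
import random


def unsat_clauses(clauses, assign):
    """Clauses with no true literal under `assign`."""
    return [c for c in clauses
            if not any(assign[abs(l)] == (l > 0) for l in c)]


def mixed_local_search(clauses, n, p=0.5, max_flips=10_000, seed=0):
    """Local search mixing random and deterministic variable selection."""
    rng = random.Random(seed)
    assign = [None] + [rng.random() < 0.5 for _ in range(n)]

    def cost_after_flip(v):
        assign[v] = not assign[v]
        cost = len(unsat_clauses(clauses, assign))
        assign[v] = not assign[v]
        return cost

    for _ in range(max_flips):
        unsat = unsat_clauses(clauses, assign)
        if not unsat:
            return assign
        if rng.random() < p:
            # Random part: a variable from a randomly chosen unsatisfied clause.
            v = abs(rng.choice(rng.choice(unsat)))
        else:
            # Deterministic part: the flip leaving the fewest unsatisfied clauses.
            v = min(range(1, n + 1), key=cost_after_flip)
        assign[v] = not assign[v]
    # Budget exhausted: report success only if the last flip happened to work.
    return assign if not unsat_clauses(clauses, assign) else None


clauses = [[1, -2], [2, 3], [-1, -3]]
solution = mixed_local_search(clauses, n=3)
```

Returning `None` only means that the budget ran out; as noted in Section 7.1, it says nothing about unsatisfiability.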

The  partial  and  pre-  random  variable  selection  heuristics  were  implemented 
in  the  SAT3  algorithm  in  1990  [211,  212]  and  were  used  to  solve  the  large  size 
n-queen  problems  around  1990  [484,  485,  488,  483,  481].  A  similar  heuristic  to 
the  partial  random  variable  selection,  random  walk,  was  developed  by  Selman, 
Kautz,  and  Cohen  independently  in  1994  [472]. 

Performance. Random and partial variable selection heuristics were introduced
in the design of SAT1, QS2, QS3, and QS4 algorithms [207, 220, 211,
212, 482, 484, 485, 488]. They can overcome the weakness of the greedy local
search algorithms. Compared to greedy local search, they can offer many orders of
magnitude of performance improvements in terms of computing time, solving hard
and large satisfiability problems and multi-million n-queens problems in seconds
[211, 212, 484, 488]. They were used in the design of SAT1.5, SAT2, and SAT3
algorithms [211, 212].

Selman  et  al.  have  recently  developed  and  applied  a  number  of  random  variable 
selection  heuristics  to  improve  the  performance  of  the  greedy  GSAT  algorithm 
[472]. 

4.  The  Trap  Handling  Heuristics. 

The  search  is  a  process  of  combating  local  minima.  When  the  search  process 
is  approaching  the  final  search  stage,  trap  handling  heuristics  are  needed  to  cope 
with  local  minima  and  traps  (see  Sections  7.2  and  7.7). 

Tunneling  Heuristic  [220,  212,  542]:  Change  the  value  of  a  variable  if  it 
does  not  change  the  value  of  the  objective  function. 

Tunneling  Heuristic  for  the  SAT  Problem  [220,  212,  469]:  Flip  the 
truth  value  of  a  variable  if  it  does  not  change  the  value  of  the  objective  function 
(see  Section  7.6). 
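
The tunneling heuristic for SAT amounts to allowing flips that leave the objective unchanged. The sketch below lists such flips for a given assignment; the name `tunneling_moves` and the clause encoding are assumptions of the example.

```python
def unsat_count(clauses, assign):
    """Objective: number of clauses with no true literal."""
    return sum(
        1 for clause in clauses
        if not any(assign[abs(l)] == (l > 0) for l in clause)
    )


def tunneling_moves(clauses, assign):
    """Variables whose flip leaves the objective value unchanged."""
    current = unsat_count(clauses, assign)
    moves = []
    for v in range(1, len(assign)):
        assign[v] = not assign[v]            # tentatively flip v
        if unsat_count(clauses, assign) == current:
            moves.append(v)
        assign[v] = not assign[v]            # undo the flip
    return moves


# (x1 or x2) and (not x1 or x3) and (not x3 or not x2)
clauses = [[1, 2], [-1, 3], [-3, -2]]
assign = [None, False, False, False]         # objective value 1
sideways = tunneling_moves(clauses, assign)
```

In this instance flipping x2 would improve the objective, while flipping x1 or x3 moves across a flat region without changing it.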

Local  Tracking  Heuristics  [211,  222].  Local  tracking  heuristics  are  used  to 
track and break local loops (a periodic occurrence of a set of local minima). Several
frequently  used  heuristics  include:  track  local  loop(s)  when  falling  into  a  trap;  give 
low  priority  to  flip  to  variables  in  a  local  minimum  loop;  give  high  priority  to  flip  to 
variables  that  lead  to  a  new  descending  direction;  lock  and  release  trapping  variables 
periodically,  adaptively,  or  statistically;  move  gently  in  a  trap  to  handle  fine  local 
structures;  move  strongly  in  a  trap  to  handle  coarse  local  structures;  jump  out  of  a 
trap  if  walking  inside  it  sufficiently  long. 
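
As a rough illustration of tracking a local loop, the sketch below remembers a bounded window of recently visited assignments and reports when the search returns to one of them. The class name `LoopTracker` and the window size are assumptions; the cited algorithms may detect loops quite differently.

```python
from collections import deque


class LoopTracker:
    """Remember recent assignments and flag a revisit as a possible loop."""

    def __init__(self, window=50):
        # Only the last `window` assignments are kept.
        self.recent = deque(maxlen=window)

    def visit(self, assign):
        """Record `assign`; return True if it was seen within the window."""
        key = tuple(assign)
        looping = key in self.recent
        self.recent.append(key)
        return looping


tracker = LoopTracker(window=3)
first = tracker.visit([False, True])
second = tracker.visit([True, True])
third = tracker.visit([False, True])     # same assignment as the first visit
```

A search procedure could react to a `True` result by, for example, applying a random flip or locking the variables involved, in line with the heuristics listed above.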

Multiphase Search Heuristics [207, 211, 485, 488, 212, 433, 542, 222].
Multiphase heuristics are a part of multispace search heuristics [213, 226, 215].
They have been developed to adapt to the different phases of a search process:
perform a poor initial search and then a serious local search for conflict minimization;
perform a good initial search and then a serious local search for conflict
minimization; perform a good initial search, then a rough local search, and a serious local
search for conflict minimization; perform an initial search, and then a rough local
search and a serious local search alternately for conflict minimization; perform a
rough initial search, then a coarse local search, and finally, a fine local search for
conflict minimization.

Multispace Search Heuristics [213, 226, 215]. Structural multispace
operations have been developed that empower a search process with an information
flux which is derived from a sequence of stepwise structural transformations (see
Section 11.4). These include multispace scrambling, extradimension transition,
search space smoothing, multiphase search, local to global passage, tabu search,
and perturbations. They can disturb the environment in which local minima form and
facilitate efficient local search when there are many local minima.

Performance. Trap handling heuristics have significantly improved the search
efficiency of the SAT1.5 algorithm [211, 212, 222] (see Section 7.7). Multiphase
and multispace search heuristics have been applied to a variety of practical
applications and found to be effective [213, 211, 212, 215, 222, 433, 542, 485, 488].

7.4. Boolean Local Relaxation. Boolean local relaxation may be viewed
as a deterministic local search. It was an early inconsistency relaxation technique
developed for solving the constraint satisfaction and satisfiability problems. For
a variable having a domain with m values, m Boolean labels are used to indicate
the variable’s instantiation to the particular values. An assignment may
produce conflicts which are coded in a set of Boolean objective functions (one for
each label). The objective function for the ith variable and qth label is defined
[206]:^6

(7.1)  $f_{i,q} = \sum_{j=1}^{n} \sum_{p=1}^{m} l_{j,p} \wedge C_{ij}(q,p)$

where $C_{ij}(q,p)$ is a constraint between labels $l_{i,q}$ and $l_{j,p}$. Note that the right-hand
side of Eq. (7.1) is a formula with extended literals.

The Boolean relaxation is a local conflict minimization process (Figure 13)
[206]. During each iteration, the algorithm checks each variable for every label and
iteratively minimizes the objective functions by flipping bits (truth values) assigned
to the labels: if the objective function does not change, keep the current label
value; if the objective function decreases, keep the best (update the label) and
report the inconsistency status globally. The iteration terminates once the
inconsistency signal turns off.

The Boolean local relaxation algorithm was suitable for VLSI implementation.
During 1985 to 1988, several parallel algorithms and architectures, such as DRA2,
DRA3, and DRA5, were implemented to speed up CSP/SAT computations [206,
224, 541, 225]. Furthermore, they were combined with backtracking search for
CSP/SAT applications [206].


^6 In [206] the objective function was defined for the label directly.

procedure DRA()
boolean inconsistency;
begin
    inconsistency := TRUE;
    k := 0;
    while inconsistency = TRUE do
    begin
        inconsistency := FALSE;
        for variable i := 1 to n
            for label q := 1 to m
            begin
                f^{k+1}_{iq} := evaluate_objective_function(l, C);
                /* local conflict minimization */
                if f^{k+1}_{iq} = f^{k}_{iq} then continue;
                if f^{k+1}_{iq} < f^{k}_{iq} then update label value;
                inconsistency := TRUE;
            end;
        k := k + 1;
    end;
end;

Figure 13. DRA: A local relaxation algorithm.


Because of its iterative local conflict minimization and its direct applications
to SAT/CSP, Boolean local relaxation became a predecessor of several early local
search algorithms for CSP and SAT problems.

7.5. Constraint Satisfaction, Simulated Annealing, and Complexity
Study. Early work on constraint satisfaction, simulated annealing, and complexity
theory contributed significantly to the original development of local search
algorithms for the SAT problem. Four notable early developments are: (1) the SAT1
algorithms, (2) the n-queen models and algorithms for scheduling applications, (3)
a simulated annealing algorithm, and (4) a 2-SAT algorithm.

1. The SAT1 Algorithms.

Objective functions in DRA algorithms were defined for Boolean labels. During
the late eighties, Gu [206] observed that if the conflicts from all the Boolean
objective functions were formulated in one objective function, then the global
minimum of the objective function would correspond to a conflict-free solution of the
given CSP problem. Accordingly, the iterative local minimization procedure used
in the Boolean relaxation would become a local search procedure to minimize the
objective function. This idea led directly to the early design of SAT1 algorithms
where the objective function was defined as the number of unsatisfied clauses over
all the variables [206, 207]. Thus, the global minimum of the objective function
corresponds to the solution of the SAT problem. Following this, Gu developed a
number of randomized local search algorithms for the SAT problem. Furthermore,
efficient heuristics (described in Section 7.3) were developed to improve the
performance of the local search algorithms. Due to the important industrial applications
at that time, the effectiveness of the SAT1 algorithms was tested through two CSP
benchmarks, i.e., the SAT problem and the n-queen problems.

An important industrial application for SAT is VLSI engineering. The SAT1
algorithm family, the first local search algorithm for SAT, was developed for
theoretical study and VLSI applications. During the late eighties, there was little
progress in the theoretical analysis of the SAT1 algorithms. The SAT1 algorithm
was applied to VLSI circuit testing and synthesis. All these problems can be
formulated as instances of the SAT problem with additional performance objectives.
The SAT1 algorithm was found to be efficient for many problems but it was not
able to handle some other problems since many VLSI design problems are NP-hard.
There is only one optimum solution in some practical applications.

Another important application area for the SAT problem is industrial scheduling.
During the late eighties, IBM and NASA were working on a number of important
scheduling projects. These applications involved solving large size scheduling
problems under critical spatial, resource and timing constraints. The scheduling
problem is well known to be a satisfiability problem, since the SAT problem can
characterize an existing scheduling problem and the constraints completely.
Significant local search solutions to the scheduling problems were derived from the SAT1
algorithm family. Due to its abstract CNF formulation, however, the SAT problem
was not able to provide a descriptive geometric model that was able to demonstrate
the scheduling operations expressively.

2. N-Queen Scheduling Models and the QS Algorithms.

The n-queen problem is a benchmark for constraint satisfaction problems. During
the middle and late eighties, Gu worked on various n-queen problem models for
combinatorial  optimization  [206,  226].  He  found  that,  by  a  remarkable  coincidence, 
the  n-queen  model  represents  a  significant  model  for  scheduling  applications.  The 
underlying  structure  of  the  n-queen  problem,  represented  by  a  complete  constraint 
graph,  gives  a  relational  model  with  fully  specified  constraints  among  the  multiple 
objects  [206].  Variations  on  the  dimension,  the  objects’  relative  positions,  and  the 
weights  on  the  constraints  led  to  a  hyper-queen  problem  model.  The  hyper-queen 
problem model consisted of a combination of several simple and basic models,
including:

• n-queen problem: the base model.

• k-n-queen problem: k n-queen patterns superimposed together. When
$k = n$ we have a special case, the $n^2$-queen problem model.

• m-n-queen problem: the board is an m by n rectangle.

• t-n-queen problem: the queens’ placement follows the topological
constraints.

• w-n-queen problem: the constraints from queen to queen are weighted to
model special constraints.

• s-n-queen problem: the model requires the shortest queen placement.

Based on the n-queen, the hyper-queen problem can model the object composition,
the performance criteria, the spatial, timing, and resource constraints for an existing
scheduling problem. This made the n-queen problem a general model for a wide
range of industrial scheduling problems having critical performance criteria. By a
remarkable coincidence, the models of several difficult scheduling projects at that
time were either the n-queen or the hyper-queen problems [498]. All of them
required efficient solutions to the n-queen problems.

Polynomial time, analytical solutions for the n-queen problem exist but they
cannot solve the general search problems and have no use in practice [2, 10, 30,
156, 251, 442]. The scheduling problems modeled by various hyper-queen models
have specific performance criteria and are known to be NP-hard. When scheduling
computational tasks to multiprocessors, for example, one can use an
s-w-t-m-n-queen model. Let m denote the execution time, n the number of processors, and w
the individual tasks’ execution/communication times. The goal is to place the task
queens onto the m by n board and minimize the longest execution path, following
the given topological constraints.

The hyper-queen models freed the original n-queen problem from its puzzle
game background. Many practical applications of the n-queen and hyper-queen
models to real world problems have been found. Local search solutions for various
scheduling applications were developed during the late eighties.

Following local conflict minimization [206, 224], a QS1 algorithm was developed
during late 1987 and was implemented during early 1988. It was the first
local search algorithm developed for the n-queen problem [206, 481, 482, 483].
Three improved local search algorithms for the n-queen problem were developed
during 1988 to 1990 [319, 320, 293, 451]. QS2 is a near linear-time local search
algorithm with an efficient random variable selection strategy [484]. QS3 is a near
linear-time local search algorithm with efficient pre- and random variable selection
and assignment [484]. QS4 is a linear time local search algorithm with efficient
partial and random variable selection and assignment techniques [485, 488].
Compared to the first local search algorithm [206], partial and random variable
selection/assignment heuristics have significantly improved search efficiency by orders of
magnitude. QS4, for example, was able to solve 3,000,000 queens in a few seconds.
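
To make the idea of conflict minimization with swaps on the n-queen problem concrete, the following sketch keeps the queens in a permutation (one per row and column) and accepts a swap of two rows only when it lowers the number of diagonal conflicts, restarting from a new random permutation when no swap helps. It is a simplified illustration in the spirit of the algorithms described above, not a reproduction of QS1 to QS4: it recomputes conflicts from scratch, so it makes no attempt at the near linear running times reported for them, and all names are assumptions.

```python
import random


def diagonal_conflicts(queens):
    """Count pairs of queens on a common diagonal; queens[r] is the column in row r."""
    n = len(queens)
    return sum(1 for i in range(n) for j in range(i + 1, n)
               if abs(queens[i] - queens[j]) == j - i)


def queen_search(n, seed=0):
    """Swap-based local search for n queens with random restarts."""
    if n < 1 or n in (2, 3):
        raise ValueError("no solution exists for this n")
    rng = random.Random(seed)
    while True:                                   # restart loop
        queens = list(range(n))
        rng.shuffle(queens)                       # permutation: no row/column conflicts
        cost = diagonal_conflicts(queens)
        improved = True
        while cost > 0 and improved:
            improved = False
            for i in range(n):
                for j in range(i + 1, n):
                    queens[i], queens[j] = queens[j], queens[i]
                    new_cost = diagonal_conflicts(queens)
                    if new_cost < cost:           # keep the swap only if conflicts drop
                        cost, improved = new_cost, True
                    else:                         # otherwise undo it
                        queens[i], queens[j] = queens[j], queens[i]
        if cost == 0:
            return queens


placement = queen_search(8)
```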

Three years after releasing the QS1 algorithm, Minton et al. independently
reported a similar local search algorithm for the n-queen problem [376, 377]. A
major difference between Minton’s algorithms and Sosic and Gu’s algorithms was
that Minton’s algorithm was a one dimensional local search without using random
heuristics.

3.  A  Simulated  Annealing  Algorithm  for  Max-SAT. 

Motivated by the method of simulated annealing, Hansen and Jaumard [238]
proposed a steepest ascent, mildest descent algorithm for the maximum
satisfiability (Max-SAT) problem. In this approach, Hansen and Jaumard focused on a
local change and defined an objective function based on a switching variable and
its related clauses. The objective function maximizes local compensation for each
variable which can be used for solving the Max-SAT problem. The objective
function cannot be used for the SAT problem unless another objective function whose
global minimum corresponds to a solution of the SAT problem is given.
Furthermore, Hansen and Jaumard used local optima checking to handle the local optimum
and found it by providing additional guidance to the search direction.

4. Theoretical Study for SAT1 and 2-SAT.

During the early nineties, researchers started to work on the theoretical analysis
of local search algorithms for CSP and SAT problems. In 1991 two theoretical
studies that focused on the SAT problem were reported. Gu and Gu took three
algorithms (i.e., SAT1.1, SAT1.2, and SAT1.3) from the SAT1 algorithm family
and carried out an average time complexity study for the SAT problem [220].

During the study of the complexity of a certain natural generalization of SAT,
Papadimitriou gave a randomized algorithm for the 2-SAT problem [399]. Furthermore,
Papadimitriou showed that such a randomized algorithm finds assignments
for satisfiable 2-SAT instances in O(n^2) steps with probability approaching one, where n is
the number of variables. With further extensions [401], in theory, the algorithm
can be applied to solve random 3-SAT problems.
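
The random walk analyzed by Papadimitriou can be sketched in a few lines of Python. The sketch is illustrative rather than a transcription of [399]: the integer clause encoding (+v for variable v, -v for its negation), the function name random_walk_2sat, the seed, and the default step budget of 2n^2 are all assumptions made here.

```python
import random

def random_walk_2sat(clauses, n, max_steps=None, seed=0):
    """Random walk for 2-SAT. Literals are nonzero ints: +v is x_v, -v is NOT x_v."""
    rng = random.Random(seed)
    if max_steps is None:
        max_steps = 2 * n * n  # assumed constant in front of n^2
    assign = [False] + [rng.random() < 0.5 for _ in range(n)]  # index 0 unused

    def satisfied(clause):
        return any(assign[abs(lit)] == (lit > 0) for lit in clause)

    for _ in range(max_steps):
        unsat = [c for c in clauses if not satisfied(c)]
        if not unsat:
            return assign[1:]
        # Pick a random unsatisfied clause and flip one of its variables.
        lit = rng.choice(rng.choice(unsat))
        assign[abs(lit)] = not assign[abs(lit)]
    return assign[1:] if all(satisfied(c) for c in clauses) else None
```

A return value of None only means that no satisfying assignment was found within the step budget; like any local search, the walk cannot prove unsatisfiability.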


Early on, local search methods for the large-size n-queen scheduling problem
attracted great attention in the AI area. This was due to the close relationship
between CSP and SAT: the SAT problem is a special case of CSP. The n-queen
problem, on the other hand, is a typical benchmark problem in CSP. If one can find
an efficient (non-analytical) search algorithm for the n-queen problem, then the
algorithm can be directly translated to an efficient algorithm for the SAT problem.

Analytical solutions exist for the n-queen problem with n greater than or equal
to 4 [10, 156, 251, 442]. They consist of a restricted subset of solutions [10]. If
one formulates the n-queen problem as a CSP, backtracking can be used to search
for any general solution. In practice, backtracking search is too slow to solve the
n-queen problem for n larger than 96 [498]. Thus local search algorithms for solving
large-size n-queen problems became a breakthrough point in this direction. Following
recent work on solving large-scale n-queen problems, Selman, Levesque, and
Mitchell reported empirical results of GSAT, a greedy local search algorithm for
solving SAT [469]. Selman [468] has recently acknowledged that local search solutions
to large-size n-queen problems were “the original impetus” to the development
of the GSAT algorithm [469].

7.6. Randomized Local Search. In this section, we describe the basic structure
and major components of the randomized local search algorithms for the SAT
problem.

Model. Most discrete local search procedures were developed based on a
discrete, unconstrained optimization model, the SAT1 model [207, 211, 212, 220].
In this model, the truth values assigned to the variables are defined as:

$$x_i = \begin{cases} 1 & \text{if the variable has value true} \\ -1 & \text{if the variable has value false} \end{cases}$$

The objective function, F(x), in the SAT1 model counts the number of unsatisfied
clauses as its objective value. A CNF is true if and only if F(x) takes the global
minimum value 0 on the corresponding x.
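
As a small illustration of this objective, the following Python sketch counts unsatisfied clauses under the +1/-1 encoding. The clause encoding (+v for variable v, -v for its negation) and the name count_unsatisfied are assumptions introduced here for the example.

```python
def count_unsatisfied(clauses, x):
    """F(x): number of clauses with no true literal.

    clauses: list of lists of nonzero ints (+v is x_v, -v is NOT x_v).
    x: dict mapping variable index to +1 (true) or -1 (false).
    """
    return sum(
        1
        for clause in clauses
        if not any(x[abs(lit)] == (1 if lit > 0 else -1) for lit in clause)
    )

clauses = [[1, -2], [2, 3], [-1, -3]]
print(count_unsatisfied(clauses, {1: 1, 2: 1, 3: -1}))   # a satisfying point
print(count_unsatisfied(clauses, {1: 1, 2: -1, 3: 1}))   # one clause violated
```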

Basic Local Search. The SAT1.0 algorithm for the SAT problem is shown in
Figure 14. It consists of an initialization stage and a search stage. At the beginning
of the search, a SAT formula is generated and an initial random solution is chosen. The
number of unsatisfied clauses is computed and is assigned as the value of the
objective function. During each iteration, function test_flip() tests whether
flipping a variable would increase the objective function. If test_flip() returns true
(the flip does not increase F), a flip operation is performed by procedure
perform_flip(). Then function evaluate_objective_function() updates the
objective function.

procedure SAT1.0 ()
begin
    /* initialization */
    get_a_SAT_instance();
    x_0 := select_a_random_initial_point();
    F(x_0) := evaluate_objective_function(x_0);

    /* search */
    k := 0;
    while F(x_k) ≠ 0 do
    begin
        for each variable i := 1 to n do
            /* if flip(x_i, k) does not increase F */
            if test_flip(x_i, k) then
            begin
                x_{k+1} := perform_flip(x_i, k);
                F(x_{k+1}) := evaluate_objective_function(x_{k+1});
            end;

        /* random flips */
        if local then local_handler();
    end;
end;

Figure 14. SAT1.0: a randomized local search procedure for the
SAT problem [211, 212, 220]. Random flips are introduced (1)
to disorder the sequence with which the variables are selected for
local optimization, and (2) to perturb local search with randomized
downhill, tunneling, or uphill moves [220, 212].

The procedure terminates when the objective function is reduced to zero, i.e.,
a solution to the given SAT instance is found. In practice, before the objective
function reduces to zero, the procedure may become stuck at local minima. In
the SAT1.0 algorithm [211, 220], a simple local handler performing random flips
was used (Figure 15). This combined the greedy local descent (reducing the objective
function) with random uphill moves (increasing the objective function), effectively
improving SAT1’s convergence performance. In the SAT1 algorithm family, one or
more local handlers were implemented [220, 211, 212]. If the algorithms have
difficulty proceeding, they call the local handlers and use special
heuristics (see Section 7.3) to improve their convergence performance.

The random flips used in the SAT1 algorithms make the order in which
variables are selected for local examination (i.e., the for loop) trivial [220, 211, 212]. One can
essentially select any variable randomly for examination during any phase of the
local search.

procedure local_handler ()
begin
    randomly select some variables x_i's;
    x_{k+1} := perform_flip(x_i's, k);
    F(x_{k+1}) := evaluate_objective_function(x_{k+1});
end;

Figure 15. A simple local handler used in the SAT1 algorithms
[211, 212, 220]. Random flips or a new random solution were
applied to the algorithm if (1) F ≠ 0 (SAT1.0 algorithm), (2) F >
0 (SAT1.1 algorithm), (3) F(x_{k+1}) = F(x_k) (SAT1.2 algorithm),
and (4) F > 0 and F(x_{k+1}) = F(x_k) (SAT1.3 algorithm) [211,
212, 220].
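
The control flow of Figure 14 and the local handler of Figure 15 can be mirrored in Python. This is a minimal sketch under stated assumptions, not the SAT1.0 implementation of [211, 220]: the clause encoding (+v for variable v, -v for its negation), the names num_unsat and sat1_0, the iteration cap, and the handler trigger (a full sweep without a strict decrease of F) are choices made here, and F is recomputed from scratch rather than updated incrementally.

```python
import random

def num_unsat(clauses, x):
    """F(x): number of clauses with no true literal (x[0] is unused)."""
    return sum(not any(x[abs(l)] == (l > 0) for l in c) for c in clauses)

def sat1_0(clauses, n, max_iters=10000, seed=0):
    rng = random.Random(seed)
    x = [False] + [rng.random() < 0.5 for _ in range(n)]  # random initial point
    f = num_unsat(clauses, x)
    for _ in range(max_iters):
        if f == 0:
            return x[1:]
        improved = False
        for i in range(1, n + 1):
            x[i] = not x[i]                  # tentative flip of variable i
            f_new = num_unsat(clauses, x)
            if f_new <= f:                   # test_flip: F does not increase
                improved = improved or f_new < f
                f = f_new
            else:
                x[i] = not x[i]              # undo the flip
        if f > 0 and not improved:           # assumed trigger for the handler
            i = rng.randint(1, n)            # local handler: one random flip
            x[i] = not x[i]
            f = num_unsat(clauses, x)
    return x[1:] if f == 0 else None
```

Because flips that leave F unchanged are accepted, the sweep already performs sideways moves; the handler adds the occasional uphill move.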


Random Flips (Noise). If the local search procedure becomes stuck at a
local minimum, further progress may be achieved by using a noise perturbation to
change its location in the search space. The effectiveness with which local minima
are handled significantly affects the performance of a local search algorithm.
Researchers have proposed a number of techniques such as jumping, climbing,
annealing, and indexing to handle local minima [213]. In simulated annealing, a
search process occasionally moves up rather than down in the search space, with
large uphill moves being less likely than small ones. The probability of large uphill
moves is gradually reduced as the search progresses.

A variety of local handlers have been designed for use in the local search
algorithms [220, 212]. SAT1.0 [207, 220, 211, 212] used a local handler that may
randomly negate the truth values of one or up to n variables (a new solution point)
(Figure 15). The basic idea is to generate random exchanges in some current
solution points when the search is stuck at a local minimum. The search accepts a
modified point as a new current solution not only when the value of the objective
function is better but also when it is worse [220, 207, 211, 212]. (Traditional local
search such as GSAT used greedy local descent and restart [469].) This simple
local handler has effectively improved the convergence performance of the SAT1.0
algorithm.

Tunneling Heuristic. A local handler and its activating condition(s) have a
significant effect on the performance (running time and average running time) of
a local search algorithm for the SAT problem. The conditions for activating local
handlers differ from algorithm to algorithm (see Figure 15). In the SAT1.1 algorithm,
the local handler is called if the objective function is not zero (an aggressive
strategy) [220, 212]. In the SAT1.2 algorithm, the local handler is called if the objective
function does not increase [220, 212]. In the SAT1.3 algorithm, the local handler is
called if the objective function does not increase or the objective function is greater
than zero for some iterations [220, 212]. In the last two algorithms, the condition
“objective function does not increase” means that the objective value is either
reduced (local descent) or remains unchanged (tunneling heuristic).

Instead of making a random swing in the vertical direction in the search space
whenever a local minimum is encountered, one can tunnel through the rugged
terrain structure in a horizontal direction, moving from one local basin to another
local basin in an attempt to locate a better locally optimal solution. A tunnel
(see Figure 16) can be thought of as a short-cut passing through a mountain
separating points of equal elevation. Whenever a local minimum is encountered, a
tunnel is made through a mountain to a neighboring basin as long as this does not
increase the objective function. Tunneling can be used to search a region
with local minima effectively. The behavior of local search with tunneling illustrates
the fact that seemingly innocuous changes in an optimization routine can have a
surprisingly large effect on its performance. When tunneling was first implemented
in the SAT1 algorithm in the late eighties, it proved effective in solving
some SAT problems.
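
To make the notion of a tunnel concrete, the following Python sketch lists the single-variable flips that leave the objective unchanged at a given point. It is only an illustration: the clause encoding (+v for variable v, -v for its negation) and the names num_unsat and tunnel_moves are assumptions, and a real implementation would update F incrementally.

```python
def num_unsat(clauses, x):
    """F(x): number of clauses with no true literal (x[0] is unused)."""
    return sum(not any(x[abs(l)] == (l > 0) for l in c) for c in clauses)

def tunnel_moves(clauses, x):
    """Return the variables whose flip keeps F unchanged (candidate tunnels)."""
    f = num_unsat(clauses, x)
    moves = []
    for i in range(1, len(x)):
        x[i] = not x[i]                      # tentative flip
        if num_unsat(clauses, x) == f:
            moves.append(i)
        x[i] = not x[i]                      # restore the original value
    return moves

clauses = [[1, 2], [-1, 2], [-2, 3]]
x = [False, True, True, False]               # x1 = true, x2 = true, x3 = false
print(tunnel_moves(clauses, x))
```

For this point, flipping x1 or x2 keeps F at 1 (a tunnel), while flipping x3 is a descent to F = 0 and is therefore not listed.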

Parallel Local Search. Several parallel algorithms and VLSI architectures
have been developed to accelerate the solution of CSP and SAT problems [224, 225, 212,
489]. Depending on the implementation, there are several ways of grouping variables
or clauses together in parallel so they can be evaluated simultaneously. In the
SAT1 algorithms, the most frequently used part of the computation is the function
evaluate_objective_function(). It takes O(ml) time to update the objective function.
The execution of evaluate_objective_function() can be done in a simple bit-parallel
manner in O(m) time on a sequential computer.
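
The bit-parallel idea can be sketched in Python with integers standing in for machine words. The packing scheme below (one mask of positive occurrences and one mask of negative occurrences per clause, indexed by variable) is an assumption made for illustration and is not necessarily the layout used in the SAT1 implementations; the names pack and count_unsat_bits are likewise hypothetical.

```python
def pack(clauses):
    """Pack each clause into (pos_mask, neg_mask); bit v-1 stands for variable v."""
    packed = []
    for clause in clauses:
        pos = neg = 0
        for lit in clause:
            if lit > 0:
                pos |= 1 << (lit - 1)
            else:
                neg |= 1 << (-lit - 1)
        packed.append((pos, neg))
    return packed

def count_unsat_bits(packed, a):
    """Count unsatisfied clauses; bit v-1 of a is set when variable v is true."""
    # A clause is satisfied if a true variable occurs positively in it
    # or a false variable occurs negatively; both tests are single word operations.
    return sum(1 for pos, neg in packed if not ((pos & a) | (neg & ~a)))

packed = pack([[1, -2], [2, 3], [-1, -3]])
print(count_unsat_bits(packed, 0b011))  # x1 = true, x2 = true, x3 = false
print(count_unsat_bits(packed, 0b101))  # x1 = true, x2 = false, x3 = true
```

Each clause is tested with a constant number of word operations, which is where the O(m) bound per evaluation comes from when the masks fit in a machine word.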

A computer word has 32 or 64 bits (such as the DEC Alpha machine). The
number of literals in a clause of most practical CNF formulas is much less than 32.
In a local search algorithm, therefore, one can pack all the literals in a clause into the
bits of a computer word and then evaluate all the literals in one clause in parallel.
For m clauses, instead of O(ml), it will take procedure evaluate_objective_function()
O(m) time to evaluate and update the objective function. Occasionally, a clause
may have more than 32 literals; these can be packed into several computer words,
all of which can be evaluated simultaneously. This general bit-parallel evaluation
method was implemented in the SAT1.7 algorithm [207, 212].

Complete Local Search. Local search algorithms are incomplete, i.e., they
can find some solutions for certain CNF formulas and give no answer if the CNF
formula is not satisfiable. To overcome this problem, researchers developed complete
local search algorithms to test satisfiability as well as unsatisfiability. The
basic idea in the SAT1.11 and SAT1.13 algorithms [207, 212] was to combine local
search with a systematic search procedure, keeping local search’s efficiency while
maintaining search completeness by the systematic search method [207, 212]. If at
a node of the search tree a solution point is found unsatisfiable, then the algorithm
backtracks and continues searching until a solution is found or unsatisfiability is
proven.

The SAT1.11 and SAT1.13 algorithms were two early experiments with complete
local search algorithms [207, 212]. Probe order backtracking is a simplified version
of complete local search [430, 431]. Recently Crawford studied a complete local
search algorithm [110]. He used weights assigned to clauses to help choose branch
variables. Variables occurring in heavily weighted clauses were given precedence.

7.7. Randomized Local Search with Trap Handling. Based on early
observation of the trap phenomenon and the development of a three-level search space
model (Section 7.2), Gu et al. developed the SAT1.5 algorithm with trap handling
ability [211, 222]. The SAT1.5 algorithm can monitor and break local minimum
loops and can handle multiple traps during the search. The current version of the
SAT1.5 algorithm contains advanced data structures and complicated trap
detection/handling methods [211, 212, 222]. For the sake of simplicity, Figure 17 gives
a brief outline of the algorithm.

The SAT1.5 algorithm starts with an initial random solution and a set of limiting
parameters. Max_Time, for example, specifies the maximum number of times a new
search may be restarted. The number of unsatisfied clauses is computed and is
assigned as the value of the objective function. The first while loop is limited by
Max_Time. Procedure complete_flip() flips all the variables that can reduce the
value of the objective function. Evaluate_objective_function() updates the objective
function.

The second while loop is a randomized local search with trap tracking and handling.
Trap detection facilities are installed at several places in the while loop to record
trap statistics. They are essential to figure out trap “height,” “width,” and other
parameters for subsequent decision making. A trap may contain a global minimum
solution and it must be searched with reasonable effort. Leaving a trap too early
or too late could result in either losing solutions or wasting computing time. The
time to jump out of a trap is determined by parameter Max_Trapping_Times.

When the search algorithm jumps out of a trap, there are several alternatives
to pursue. One is to start a new search. In the while loop, several randomized
local search procedures deploying random value and random variable heuristics
(see Section 7.3) are grouped together with partial random selection heuristics.
They together select a variable for randomized local search (the objective function
F may increase during the search).

procedure SAT1.5 ()
begin
    /* initialization */
    get_a_SAT_instance();
    x_0 := select_a_random_initial_point();
    F(x_0) := evaluate_objective_function(x_0);

    /* search */
    k := 0; Restart_Times := 0;
    while F > 0 and Restart_Times < Max_Time do
    begin
        /* Open Search: flip all variables that reduce F */
        x_{k+1} := complete_flip(x_k);
        F := evaluate_objective_function(x_{k+1});

        /* parameters for trap tracking */
        Clean_trap_records(); Trapping_Times := 0;

        /* Peak Search: randomized local search */
        while F > 0 and Trapping_Times < Max_Trapping_Times do
        begin
            /* randomly select one var for randomized local search */
            x_i := select_one_var_to_flip(x_{k+1});
            x_{k+1} := randomized_local_search(x_i, k);
            F := evaluate_objective_function(x_{k+1});

            /* Trap Search */
            if a trap is detected then
            begin
                Trapping_Times++;
                /* random flip vars in conflicting clauses */
                x_{k+1} := strong_flip(x_{k+1});
                F := evaluate_objective_function(x_{k+1});
                /* random flip a few percent (pct) of variables */
                x_{k+1} := gentle_flip(x_{k+1}, pct);
                F := evaluate_objective_function(x_{k+1});
                /* random flip a small set of variables */
                x_{k+1} := weak_flip(x_{k+1}, set);
                F := evaluate_objective_function(x_{k+1});
                /* initialization for a new trap */
                Clean_trap_records();
            end;
        end;

        if F > 0 then
            x_{k+1} := restart_a_new_random_point();
        Restart_Times++;
        k := k + 1;
    end;
end;

Figure 17. SAT1.5: a randomized local search procedure with
trap handling.

procedure GSAT ()
begin
    for i := 1 to MAX-TRIES
        T := a randomly generated truth assignment
        for j := 1 to MAX-FLIPS
            if T satisfies α then return T
            p := a propositional variable such that a change
                 in its truth assignment gives the largest
                 increase in the total number of clauses
                 of α that are satisfied by T
            T := T with the truth assignment of p reversed
        end for
    end for
    return “no satisfying assignment found”
end

Figure 18. GSAT: a greedy local search procedure for the SAT
problem [469]. MAX-FLIPS and MAX-TRIES are constants, and α
is a set of clauses. During each search step, GSAT takes the best
neighbor that gives the maximum descent to the objective function.

If a trap is detected, a number of strategies can be used to conduct a trap
search [212, 222]. In one approach proposed by Gu et al., a sequence of random
flip operations is performed (see Figure 17). The intensity of the flip operations
evolves from strong to weak, tailored to the “coarse” as well as “fine” structures
in a trap. That is, a variable is flipped in each unsat clause to force it to value
true (procedure strong_flip()), followed by a random flip of a few percent of the
variables (procedure gentle_flip()), and finally, a random flip of a small number of
variables (procedure weak_flip()). Additional facilities for hill climbing, tabu search,
and variable locking/unlocking were developed. The SAT1.5 algorithm can walk
on the rugged surface of a trap adaptively.
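
The strong-to-weak sequence of flips can be sketched in Python as follows. This is a hedged illustration of the three steps named in Figure 17, not the SAT1.5 code: trap detection is omitted, the clause encoding (+v for variable v, -v for its negation) is assumed, and the helper random_flip, the wrapper trap_search, and the default values pct=5 and small=1 are hypothetical.

```python
import random

def is_sat(clause, x):
    """True if at least one literal of the clause is true under x (x[0] unused)."""
    return any(x[abs(l)] == (l > 0) for l in clause)

def strong_flip(clauses, x, rng):
    """Force each currently unsatisfied clause to true by setting one of its literals."""
    for clause in clauses:
        if not is_sat(clause, x):
            lit = rng.choice(clause)
            x[abs(lit)] = lit > 0
    return x

def random_flip(x, count, rng):
    """Flip `count` distinct, randomly chosen variables."""
    for i in rng.sample(range(1, len(x)), count):
        x[i] = not x[i]
    return x

def trap_search(clauses, x, pct=5, small=1, seed=0):
    rng = random.Random(seed)
    n = len(x) - 1
    strong_flip(clauses, x, rng)                      # strong: repair unsat clauses
    random_flip(x, max(1, n * pct // 100), rng)       # gentle: a few percent of variables
    random_flip(x, min(small, n), rng)                # weak: a small set of variables
    return x
```

Note that a clause repaired by strong_flip may be broken again by a later flip in the same pass; the sketch makes no attempt to prevent this.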

The real execution performance of the SAT1.5 algorithm (Section 13.2) suggests
that it is presently one of the fastest local search algorithms for the SAT problem.

7.8.  Greedy  Local  Search.  Traditional  local  search  proceeds  by  taking  a 
feasible  solution  point  that  reduces  the  value  of  the  objective  function.  Among  many 
neighboring  solution  points,  local  search  does  not  evaluate  its  neighbors’  relative 
performance  with  respect  to  the  objective  function.  A  greedy  algorithm  selects  the 
best  neighbor  that  yields  the  minimum  value  of  the  objective  function  and  takes 
this  best  neighbor  direction  as  the  descent  direction  of  the  objective  function.  In 
a  real  search  space,  continuous  optimization  algorithms  can  find  the  best  neighbor 
solution  efficiently.  Unconstrained  local  and  global  optimization  algorithms  have 
been  developed  for  solving  the  SAT  problem  (see  [211,  217]  and  Section  8). 

In  the  discrete  search  space,  a  greedy  local  search  algorithm  searches  for  the  best 
neighbor  solution.  This  requires  that  during  each  iteration  the  algorithm  examine 
all  the  possible  moves  and  select  one  with  maximum  descent.  Greedy  local  search 
is  a  special  case  of  the  coordinate  descent  in  the  real  space  [211,  217]. 

Selman et al. proposed a greedy local search procedure, i.e., GSAT, for the
SAT problem [469]. During each search step, the algorithm evaluates all the moves
and selects the best one that gives the greatest decrease in the total number of
unsatisfied clauses. If the algorithm becomes stuck at a local minimum, GSAT uses
side-walk (a form of the tunneling heuristic) to move aside. In the GSAT procedure, two
parameters, MAX-TRIES and MAX-FLIPS, were used to limit the algorithm’s
maximum running time.

In the late eighties, VLSI researchers experimented with a large number of practical
SAT formulas with greedy local search and found that greedy local search became
stuck at local minima much more easily. Accordingly, Gu proposed a method
of combining local descent with random, multiphase search, and trap handling
heuristics (see Section 7.3 and Section 7.7). These ideas were used in the subsequent
SAT1 algorithm family design [220, 207, 211, 212].

Recently Selman et al. used the bad variable heuristic and the partial random
variable selection heuristics (Section 7.3) in their random walk heuristic [472, 471].
They found that these random heuristics (such as random flips, selecting a variable
in an unsat clause, and partial random variable selection) improved the performance
of the GSAT algorithm significantly [472].
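
A compact Python rendering of the GSAT loop of Figure 18, extended with an optional random-walk step, might look as follows. It is a sketch under assumptions: the clause encoding (+v for variable v, -v for its negation), the parameter walk_prob, its use as the probability of flipping a variable from a randomly chosen unsatisfied clause, random tie-breaking, and the default limits are all choices made here rather than details taken from [469, 472].

```python
import random

def num_unsat(clauses, x):
    """Number of clauses with no true literal (x[0] is unused)."""
    return sum(not any(x[abs(l)] == (l > 0) for l in c) for c in clauses)

def gsat(clauses, n, max_tries=10, max_flips=1000, walk_prob=0.0, seed=0):
    rng = random.Random(seed)
    for _ in range(max_tries):
        x = [False] + [rng.random() < 0.5 for _ in range(n)]
        for _ in range(max_flips):
            unsat = [c for c in clauses
                     if not any(x[abs(l)] == (l > 0) for l in c)]
            if not unsat:
                return x[1:]
            if rng.random() < walk_prob:
                # Random walk: flip a variable of a random unsatisfied clause.
                var = abs(rng.choice(rng.choice(unsat)))
            else:
                # Greedy step: score every flip and take one of the best.
                scores = []
                for i in range(1, n + 1):
                    x[i] = not x[i]
                    scores.append((num_unsat(clauses, x), i))
                    x[i] = not x[i]
                best = min(s for s, _ in scores)
                var = rng.choice([i for s, i in scores if s == best])
            x[var] = not x[var]
    return None
```

With walk_prob=0.0 the procedure reduces to plain GSAT, which takes the best move even when that move is sideways or uphill.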

7.9. Tabu Local Search. Mazure, Sais, and Gregoire proposed a tabu search
algorithm, TSAT, for the satisfiability problem [367]. The basic idea behind TSAT
is to avoid using randomness in local search algorithm design. TSAT makes
systematic use of a tabu list of variables in order to avoid recurrent flips and thus
escape from local minima. The tabu list is updated each time a flip is made.
TSAT keeps a fixed-length, chronologically ordered FIFO list of flipped variables
and prevents any of the variables in the list from being flipped again during a given
amount of time.

In this study, Mazure et al. found that the length of the tabu list
is crucial to the algorithm’s performance. They showed that, for random 3-SAT
instances, the optimal length of the tabu list L(n) for TSAT is [367]:

(7.3)  L(n) = 0.01875n + 2.8125.

Furthermore, they noted that a slight departure from the optimal length leads to a
corresponding graceful degradation of the performance of TSAT. A larger
departure from this optimal length leads to a dramatic performance degradation.
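
The tabu mechanism and the length formula (7.3) can be illustrated with the short Python sketch below. It is not the TSAT implementation of [367]: the rounding of L(n) to an integer, the all-false starting point, the first-best tie-breaking rule, and the release of the oldest entry when every variable is tabu are assumptions made to keep the example deterministic and self-contained.

```python
from collections import deque

def num_unsat(clauses, x):
    """Number of clauses with no true literal (x[0] is unused)."""
    return sum(not any(x[abs(l)] == (l > 0) for l in c) for c in clauses)

def tabu_length(n):
    """Tabu list length from (7.3), rounded to an integer (rounding is assumed)."""
    return max(1, round(0.01875 * n + 2.8125))

def tsat(clauses, n, max_flips=10000):
    x = [False] * (n + 1)                    # assumed deterministic start
    tabu = deque(maxlen=tabu_length(n))      # FIFO list of recently flipped variables
    for _ in range(max_flips):
        if num_unsat(clauses, x) == 0:
            return x[1:]
        best_var, best_f = None, None
        for i in range(1, n + 1):
            if i in tabu:
                continue                     # tabu variables may not be flipped
            x[i] = not x[i]
            f_new = num_unsat(clauses, x)
            x[i] = not x[i]
            if best_f is None or f_new < best_f:
                best_var, best_f = i, f_new
        if best_var is None:                 # every variable is tabu
            tabu.popleft()
            continue
        x[best_var] = not x[best_var]
        tabu.append(best_var)                # the oldest entry drops out automatically
    return None
```

A bounded deque gives the fixed-length FIFO behavior directly: appending to a full list discards its oldest element.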

7.10. Local Search for DNF formulas. Using the well-known DeMorgan
laws, we can obtain an unconstrained optimization model, the SAT4 model, for
DNF formulas [207, 217]. With SAT4, a CNF formula

(Xl  12)  (^1  +X2+X4)  (X2  +  X3) 

can  be  transformed  into  a  DNF  formula: 

X1X2  +  X1X2X4  +  X2X3. 

For the transformed formula, the objective is to determine whether there exists an
assignment where all clauses are falsified. That is, to solve (4.9).

A number of local search algorithms were developed for DNF formulas. Except
for different definition and evaluation schemes in the objective function, they have
structures similar to those of CNF local search algorithms. In SAT1.4 [207], one of the
early DNF local search algorithms, the objective function is defined as the number
of satisfied DNF terms. Our goal here is to reduce the objective function to zero.
Experimental results indicate that DNF local search algorithms are faster than
CNF local search algorithms.
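
The DeMorgan transformation and the DNF objective can be checked on a small instance with the Python sketch below. The clause encoding (+v for variable v, -v for its negation), the example formula, and the names negate_cnf and satisfied_terms are assumptions introduced for illustration; they are not taken from the SAT1.4 implementation.

```python
def negate_cnf(clauses):
    """DeMorgan: the negation of a CNF is a DNF whose terms complement every literal."""
    return [[-lit for lit in clause] for clause in clauses]

def satisfied_terms(terms, x):
    """DNF objective: number of terms in which every literal is true (x[0] unused)."""
    return sum(all(x[abs(l)] == (l > 0) for l in term) for term in terms)

cnf = [[1, -2], [2, 3], [-1, -3]]
dnf = negate_cnf(cnf)                        # [[-1, 2], [-2, -3], [1, 3]]
print(satisfied_terms(dnf, [None, True, True, False]))   # CNF satisfied here
print(satisfied_terms(dnf, [None, True, False, True]))   # one CNF clause violated
```

A DNF term is satisfied exactly when the corresponding CNF clause is falsified, so the count of satisfied terms equals the count of unsatisfied clauses and reaches zero on the same assignments.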

7.11. A Historical Note. Early work in constraint satisfaction, simulated
annealing, and complexity study contributed to the development of local search
algorithms for the SAT problem (see Sections 7.3, 7.4, and 7.5, and Figure 19). A
special event was the n-queen debate in ACM SIGART Bulletin between 1990 and
1992.

Figure 19. Early development of local search algorithms for the SAT
problem. There were two major approaches: randomized local
search (SAT1) and greedy local search (GSAT). SAT1 was
the first local search algorithm developed for VLSI engineering
and scheduling applications. The GSAT algorithm was derived
from the early local search algorithms for the n-queen problem.


Early on, the SAT1 algorithms were applied to solve VLSI circuit design problems.
In addition, Gu and Sosic implemented the same local search method for the
n-queen problems. Later they published two short papers in SIGART Bulletin about
their results [483, 485]. By accident, these two papers triggered a debate. Major
discussions centered around two questions raised by the SIGART readers.

First, Jack Mostow mentioned that Steve Minton at the same time published “a
hill-climbing algorithm very similar to Gu’s” for the n-queen problem at AAAI’90.
He was interested to know the original source of the local search algorithm for
the n-queen problem. Lewis Johnson (SIGART editor) reviewed the original local
search results for the n-queen problem [206] and found that: “It is now clear that
the n-queens problem is a solved problem; in fact, it has been solved for many
years” [293].

The second question was about local search. Bo Bernhardsson showed (in
SIGART Bulletin, Vol. 2, No. 2, 1991) that the analytical solutions for the n-queen
problem were published in 1969. In the same issue, Jun Gu wrote an article entitled
“On a General Framework for Large-Scale Constraint-Based Optimization.” He
explained that the analytical solutions to the n-queen problem only offer a restricted
set of solutions which cannot solve a general search problem, and the local search
algorithm for n-queen can be used to solve general constraint satisfaction problems.
The discussions continued in a number of SIGART Bulletins [294]. In August 1991,
M. Valtorta showed more analytical solutions to the n-queen problem and the Tower
of Hanoi problem [523]. Many SIGART readers sent emails to Jun Gu. They agreed
that the analytical solutions are restricted but some also believed that local search
can only solve problems like the n-queen. The SAT problem is the core of many
NP-complete problems. To give a strong case, Gu published a short article “Efficient
Local Search for Very Large-Scale Satisfiability Problem” [211] and discussed the
SAT1 algorithms as examples of local search to the SIGART readers.

During the two-year period, many researchers including Jack Mostow, Steve
Minton, Bart Selman, and Dennis Kibler participated in the various discussions.

8. Global Optimization

Local search proceeds by taking any solution point that decreases the value of
the objective function as the next solution point. Since there may be many
neighboring solution points and a local search does not take into account its neighbors’
relative performance to the objective function, a local search may get stuck in a
local minimum or a basin. To escape from such local minima, global search
strategies need to be developed. One such strategy is the tunneling heuristic discussed
in Section 7.6. Another strategy is to select the best neighboring point that yields
the minimum value to the objective function. When there is no neighboring point
that leads to a decrease in the objective function, a direction is picked to minimize
the increase in the objective function.

Global  optimization  is  concerned  with  the  characterization  and  computation  of 
global  minima  and  maxima  of  unconstrained  nonlinear  functions  and  constrained 
nonlinear  problems  [162,  163,  266,  402].  Global  optimization  problems  belong  to 
the  class  of  NP-hard  problems. 

The concept of optimization is well rooted as a principle underlying the analysis
of many complex decision problems. When one deals with a complex decision
problem, involving the selection of values for a number of interrelated variables,
one should focus on a single objective (or a few objectives) designed to quantify
performance and measure the quality of the decision. The core of the optimization
process is to minimize (or maximize) an objective function subject to constraints
imposed upon values of decision variables in an instance.

Most optimization algorithms are designed as an iterative refinement process.
Typically, in seeking a vector that solves an optimization problem, a search
algorithm selects an initial vector y_0 and generates an improved vector y_1. The process
is repeated to find a better solution y_2. Continuing in this fashion, a sequence of
ever-improving points y_0, y_1, ..., y_k, ... is found that approaches a solution point
y*. When it is not possible to find neighboring points to improve, strategies are
applied to help escape from local minima.

There  are  three  aspects  in  designing  global  search  strategies  to  solve  SAT: 


•  Problem  formulations  and  transformations.  As  discussed  in  Section  4.1, 
there  are  alternative  formulations  of  an  instance  of  SAT,  and  global  search 
strategies  may  need  to  be  tailored  to  the  formulation  used.  In  Section  8.1,  we 
present the UniSAT model that transforms a SAT formula represented as an
instance  of  a  discrete  constrained  decision  problem  in  Boolean  {0, 1}  space 
into  a  continuous  optimization  problem  [207,  210,  217].  In  Section  8.7,  we 
present  strategies  based  on  discrete  Lagrange  multipliers  to  transform  a  SAT 
formula  into  an  instance  of  a  discrete  constrained  optimization  problem  [535, 
537].  Other  more  general  transformations  are  presented  in  Section  11. 

•  Strategies  to  select  a  direction  to  move.  Since  a  search  trajectory  lacks  global 
information  in  a  search  space,  strategies  to  select  a  direction  to  move  are 
either  steepest  descent  or  hill  climbing.  A  steepest-descent  approach  chooses 
the  direction  with  the  maximum  gradient.  A  hill-climbing  approach,  on 
the  other  hand,  chooses  the  first  point  in  the  neighborhood  of  the  current 
point  that  reduces  the  objective  function.  For  large  formulas,  hill-climbing 
methods  are  much  faster  than  steepest  descent  because  they  descend  in  the 
first direction, rather than the best direction, that leads to improvement.

•  Strategies to help escape from local minima. Many possible strategies have
been  studied  in  the  past.  These  include  local  handlers  that  use  a  combina¬ 
tion  of  restarts,  backtracking  and  random  swaps  (see  Section  7.3  and  [207, 
220,  211,  225,  209]),  Morris’  “break-out”  strategy  [385],  Wah  and  Shang’s 
Discrete  Lagrangian  Method  (DLM)  [535,  537],  Glover  and  Hansen’s  tabu 
list  [199,  238],  stochastic  methods  such  as  simulated  annealing  (SA)  [306, 
74],  and  genetic  algorithms  (GA)  [252,  374].  In  Section  8.8,  we  examine 
the  effects  of  some  of  these  strategies. 


8.1.  UniSAT: Universal SAT Input Models. In UniSAT models, we extend the
discrete search space $x \in \{0,1\}^n$ into the real space $y \in E^n$ so that
each solution point and the objective function can be characterized
quantitatively. Furthermore, by encoding the solution of a SAT formula into the
objective function, a direct correspondence between the solutions of the SAT
formula and the global minimum points of the objective function can be
established. Subsequently, the SAT formula is transformed into an instance of an
unconstrained global optimization problem on $E^n$.

In UniSAT models, using the universal DeMorgan laws, all Boolean $\vee$ and
$\wedge$ connectives in CNF formulas are transformed into $\times$ and $+$, i.e.,
ordinary multiplication and addition, respectively. The true value of the CNF
formula is converted to the 0 value of the objective function. Given a CNF
formula $\mathcal{F}$ from $\{0,1\}^n$ to $\{0,1\}$ with $m$ clauses
$C_1, \ldots, C_m$, we define a real function $f(y)$ from $E^n$ to $E$ that
transforms the SAT problem into an unconstrained global optimization problem:


(8.1)  $\min_{y \in E^n} f(y)$

where

(8.2)  $f(y) = \sum_{i=1}^{m} c_i(y)$.

A clause function $c_i(y)$ is a product of $n$ literal functions $q_{ij}(y_j)$
($1 \le j \le n$):

(8.3)  $c_i(y) = \prod_{j=1}^{n} q_{ij}(y_j)$.


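In the UniSAT5 model [207, 211, 217],

$q_{ij}(y_j) = \begin{cases} |y_j - 1| & \text{if literal } x_j \text{ is in clause } C_i \\ |y_j + 1| & \text{if literal } \bar{x}_j \text{ is in clause } C_i \\ 1 & \text{if neither } x_j \text{ nor } \bar{x}_j \text{ is in } C_i \end{cases}$

and in the UniSAT7 model [207, 210, 211, 217],

$q_{ij}(y_j) = \begin{cases} (y_j - 1)^2 & \text{if } x_j \text{ is in clause } C_i \\ (y_j + 1)^2 & \text{if } \bar{x}_j \text{ is in clause } C_i \\ 1 & \text{if neither } x_j \text{ nor } \bar{x}_j \text{ is in } C_i \end{cases}$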

The correspondence between $x$ and $y$ is defined as follows (for $1 \le i \le n$):

$x_i = \begin{cases} 1 & \text{if } y_i = 1 \\ 0 & \text{if } y_i = -1 \\ \text{undefined} & \text{otherwise} \end{cases}$

Clearly, $\mathcal{F}$ has value true iff $f(y) = 0$ on the corresponding
$y \in \{-1,1\}^n$.

The UniSAT5 model on real space is a direct extension of the discrete SAT
model on Boolean space. A model similar to UniSAT5 was proposed independently
in the neural network area [290]. A significant difference between the neural
network model and UniSAT5 lies in their efficiency and practical applicability.
The neural network model can only be handled by traditional nonlinear
programming methods that are extremely slow [290], whereas UniSAT5 can be easily
solved in conjunction with the local search approach by simple discrete
accounting techniques [207, 217].

The UniSAT models transform SAT from a discrete, constrained decision problem
into an unconstrained global optimization problem [207, 210, 211, 217]. A
good property of the transformation is that UniSAT models establish a
correspondence between the global minimum points of the objective function and
the solutions of the original SAT formula. A CNF formula $\mathcal{F}$ has value
true if and only if $f(y)$ takes the global minimum value 0 on the corresponding
solution $y^*$.

Following the above formulation, with the UniSAT5 and UniSAT7 models, a
CNF formula $\mathcal{F}$

$(x_1 \vee \bar{x}_2) \wedge (\bar{x}_1 \vee x_2 \vee x_3)$

is translated into

$f(y) = |y_1 - 1||y_2 + 1| + |y_1 + 1||y_2 - 1||y_3 - 1|$

and

$f(y) = (y_1 - 1)^2 (y_2 + 1)^2 + (y_1 + 1)^2 (y_2 - 1)^2 (y_3 - 1)^2$,

respectively.
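The translation above can be checked numerically. The following is a minimal sketch, not code from the cited papers: it assumes a DIMACS-style clause encoding (signed 1-based integers), and the names `unisat_objective`, `literal_value`, `is_satisfied`, and `to_y` are hypothetical. The `power` argument selects the absolute-value (UniSAT5) or squared (UniSAT7) literal functions.

```python
# Clauses are lists of signed 1-based variable indices (an assumed encoding):
# +j stands for x_j and -j for its negation.
EXAMPLE_CNF = [[1, -2], [-1, 2, 3]]  # (x1 v ~x2) ^ (~x1 v x2 v x3)


def literal_value(lit, y, power):
    """Literal function q_ij(y_j): |y_j - 1|^power for x_j, |y_j + 1|^power for ~x_j."""
    y_j = y[abs(lit) - 1]
    target = 1.0 if lit > 0 else -1.0
    return abs(y_j - target) ** power


def unisat_objective(cnf, y, power=1):
    """f(y) = sum over clauses of the product of literal functions.

    power=1 gives the UniSAT5 form, power=2 the UniSAT7 form. Variables that
    do not occur in a clause contribute a factor of 1 and are simply skipped.
    """
    total = 0.0
    for clause in cnf:
        term = 1.0
        for lit in clause:
            term *= literal_value(lit, y, power)
        total += term
    return total


def is_satisfied(cnf, x):
    """Evaluate the CNF directly on a 0/1 assignment x."""
    return all(any((x[abs(l) - 1] == 1) == (l > 0) for l in clause) for clause in cnf)


def to_y(x):
    """Map a Boolean assignment in {0,1}^n to the matching point in {-1,1}^n."""
    return [1.0 if b == 1 else -1.0 for b in x]
```

On points of $\{-1,1\}^n$ each clause term is either 0 (some literal is true) or a positive power of 2 (all literals false), so `unisat_objective` is zero exactly when `is_satisfied` holds.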

The solution of the SAT formula corresponds to a set of global minimum points
of the objective function. Finding a true value of $\mathcal{F}$ is equivalent to
finding a false value, i.e., 0, of $f(y)$.

The translation of SAT formulas into nonlinear programs is quite different from
the integer programming approach described in the next section. In the integer
programming approach, one views a SAT formula as an instance of the 0/1 Integer
Programming problem and tries solving its Linear Programming relaxation [35,
257, 258, 284, 301, 299, 545]. If the solution is non-integer, one rounds off the
values to the nearest integers and checks whether the solution corresponds to a
solution of the original formula. If the rounded off values do not correspond to a
solution, one computes another solution of the linear programming problem.

8.2.  A Global Optimization Algorithm for solving SAT. Many families
of unconstrained global optimization algorithms for the UniSAT problem have been
developed [207, 210, 217]. SAT6.0, a basic global optimization algorithm, is shown
in Figure 20. To start, procedure obtain_a_SAT_instance() initializes a (given or
generated) SAT instance. An objective function, $f$, is formulated according to a
given UniSAT model. The SAT formula thus becomes a minimization problem of
the objective function. To begin, procedure select_an_initial_solution() selects an
initial starting point $y_0 \in E^n$. The corresponding value of the objective
function, $f(y_0)$, is evaluated by function evaluate_object_function().

The optimization process is an iterative minimization of the objective function.
Function test_min() tests if the value of the objective function can be minimized.
If this is true, the minimization operation is performed by procedure
perform_min(), followed by evaluate_object_function(), which updates the value of
the objective function. Procedures test_min(), perform_min(), and
evaluate_object_function() are usually performed together without distinction.
Depending on the global optimization

procedure SAT6.0 ()
begin
    /* initialization */
    obtain_a_SAT_instance();
    y_0 := select_an_initial_solution();
    f(y_0) := evaluate_object_function(y_0);

    /* search */
    k := 0;
    while not(solution_testing()) do
    begin
        for some y_{i(k)}'s in y_k
        begin
            /* minimizer */
            if test_min(f(y_{i(k)}'s)) then
            begin
                y_{k+1} := perform_min(f(y_{i(k)}'s));
                f(y_{k+1}) := evaluate_object_function();
            end
            if close_to_solution() then x := approximate(y_{k+1});
        end;

        /* local handler */
        if local then local_handler();
        k := k + 1;
    end;
end;

Figure 20.  SAT6.0: A general global optimization algorithm for
the satisfiability problem.


strategy, the objective function can be minimized in one or up to $n$ dimensions.
Methods capable of optimizing $f$ in one dimension include line search, coordinate
descent, and coordinate Newton's methods. Methods that optimize $f$ in more than
one dimension include the steepest descent methods, multi-dimensional Newton's
methods, and many others.

As the iterative improvement progresses, a global minimum point may be
approached gradually. The closeness between the present solution point and the
global minimum solution point can be tested by solution-point testing or
objective-value testing. Procedure close_to_solution() performs closeness testing.
If the present solution point is sufficiently close to a global minimum point,
procedure approximate() performs the round-off operation that converts a solution
point $y$ in real space $E^n$ to a solution point $x$ in Boolean space
$\{0,1\}^n$, which may be a solution of the original SAT formula. Procedure
solution_testing() takes the solution generated from procedure approximate() and
substitutes it into the given CNF formula to verify its correctness.

In  practice,  the  search  process  could  be  stuck  at  a  locally  optimum  point.  To 
improve  the  convergence  performance  of  the  algorithm,  one  or  more  local  handlers 
may  be  added.  One  effective  local  handler  in  SAT6  is  to  negate  the  truth  values  of 
up to $n$ variables.

procedure SAT14.5 ()
begin
    /* initialization */
    y := initial_vector();
    local := search := 0;  limit := b n log n;

    /* search */
    while (f(y) >= 1 and local < limit) do
    begin
        old_f := f(y);  search := search + 1;

        /* minimizer */
        for i := 1 to n do
            minimize f(y) with respect to y_i;

        /* local handler */
        if (f(y) = old_f or (search > b' log n and f(y) >= 1)) then
        begin
            y := initial_vector();
            search := 0;  local := local + 1;
        end;
    end;

    if f(y) < 1 then y* := round_off(y) else y* := enumerate();
end;

Figure 21.  SAT14.5: A global optimization algorithm for the
UniSAT5 problem.


Any existing unconstrained global optimization method can be used to solve
the UniSAT problems (see textbooks and literature). So far many global
optimization algorithms have been developed [207, 210, 217]. These include the
basic algorithms, steepest descent methods, modified steepest descent methods,
Newton's methods, quasi-Newton methods, descent methods, cutting-plane methods,
conjugate direction methods, ellipsoid methods, homotopy methods, and linear
programming methods. In each algorithm family, different approaches and
heuristics can be used to design objective functions, select initial points,
scramble the search space, formulate higher-order local handlers, deflect descent
directions, utilize parallelism, and implement hardware architectures to speed up
computations.

8.3.  A Discrete Global Optimization Algorithm. Although nonlinear
problems are intrinsically more difficult to solve, an unconstrained optimization
problem is conceptually simple and easy to handle. Many powerful solution
techniques have been developed to solve unconstrained optimization problems,
which are based primarily upon calculus and simple accounting, rather than upon
algebra and pivoting, as in the Simplex method. Based on a coordinate descent
method [356], Gu has recently given a simple algorithm, the SAT14.5 algorithm
[217, 216], for the UniSAT5 problem (see Figure 21). The kernel of SAT14.5 is a
discrete minimizer that minimizes objective function $f$ by the discrete
coordinate descent method.

Given a function $f$ on $E^n$, the SAT14.5 algorithm initially chooses a vector
$y$ from $E^n$ and then minimizes function $f$ with respect to variables $y_j$
($1 \le j \le n$) in the minimizer until $f < 1$. Since each variable $y_j$
appears in one clause function $c_i$ at most once, function $f(y)$ can be
expressed as

$f(y) = a_j |y_j - 1| + b_j |y_j + 1| + d_j$

for $1 \le j \le n$, where $a_j$, $b_j$, and $d_j$ are local gain factors that are
independent of $y_j$. They can be computed in $O(ln)$ time. Because $a_j$ and
$b_j$ are nonnegative, $f$ is piecewise linear and convex in $y_j$. Therefore,
$f(y)$ takes its minimum value with respect to $y_j$ at either $y_j = 1$ or
$y_j = -1$. Thus, the minimizer optimizes function $f$ as follows: if
$a_j > b_j$ then set $y_j$ equal to 1; otherwise set $y_j$ equal to $-1$.

In practice, before $f < 1$, the algorithm could be stuck at a local minimum
point. To overcome this problem, a simple local handler is added. The local
handler simply generates a new initial vector $y$ to start an independent search.
In the SAT14.5 algorithm, if the objective function $f$ can no longer be reduced,
or if $f$ is still at least one after $b' \log n$ iterations of the while loop
($b'$ is a constant, see [217, 216]), then the local handler is called.
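The discrete minimizer and the restart-based local handler can be sketched as follows. This is an illustrative simplification rather than the published SAT14.5 code: it assumes DIMACS-style signed-integer clauses in which each variable occurs at most once per clause, replaces the $b\,n \log n$ and $b' \log n$ limits with plain `max_restarts` and `max_sweeps` parameters, and omits the final `enumerate()` fallback. All function names are hypothetical.

```python
import random


def clause_product(clause, y, skip_var=None):
    """Product of UniSAT5 literal functions in one clause, optionally leaving out
    the factor that belongs to variable index skip_var (0-based)."""
    prod = 1.0
    for lit in clause:
        j = abs(lit) - 1
        if j == skip_var:
            continue
        prod *= abs(y[j] - (1.0 if lit > 0 else -1.0))
    return prod


def objective(cnf, y):
    """UniSAT5 objective f(y): sum of clause products."""
    return sum(clause_product(c, y) for c in cnf)


def minimize_coordinate(cnf, y, j):
    """Set y[j] to the minimizer of f(y) = a_j|y_j - 1| + b_j|y_j + 1| + d_j."""
    a = b = 0.0
    for clause in cnf:
        if (j + 1) in clause:        # x_j occurs positively: contributes to a_j
            a += clause_product(clause, y, skip_var=j)
        elif -(j + 1) in clause:     # x_j occurs negated: contributes to b_j
            b += clause_product(clause, y, skip_var=j)
    y[j] = 1.0 if a > b else -1.0


def coordinate_descent_sat(cnf, n, max_restarts=100, max_sweeps=50, seed=0):
    """Return a satisfying 0/1 assignment, or None if none was found."""
    rng = random.Random(seed)
    for _ in range(max_restarts):
        y = [rng.choice((-1.0, 1.0)) for _ in range(n)]
        for _ in range(max_sweeps):
            old_f = objective(cnf, y)
            if old_f < 1:
                break
            for j in range(n):
                minimize_coordinate(cnf, y, j)
            if objective(cnf, y) == old_f:
                break                # no progress: local handler restarts
        if objective(cnf, y) < 1:
            return [1 if v > 0 else 0 for v in y]
    return None
```

Because the sketch is incomplete in the sense used in this section, a `None` result only means that no solution was found within the restart budget.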


8.4.  A Continuous Global Optimization Algorithm. Based on a continuous
coordinate descent method [356], Gu, Huang and Du have recently developed
the SAT14.7 algorithm for solving UniSAT7 problems on $E^n$ [217]. For the
objective function described in the UniSAT7 input model, if only one variable,
e.g., $x_i$, is selected for optimization, then

(8.6)  $F(x_i) = a_i (x_i - 1)^2 + b_i (x_i + 1)^2 + c_i$

where $a_i$, $b_i$, and $c_i$ are constants that can be computed in $O(ml)$ time.
Here, $F(x_i)$ can be minimized at:

(8.7)  $x_i = \frac{a_i - b_i}{a_i + b_i}$
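The minimizer in (8.7) follows from setting the derivative of (8.6) to zero. The short derivation below assumes $a_i, b_i \ge 0$ (they are sums of products of squared terms in the UniSAT7 model) and $a_i + b_i > 0$, so that the stationary point is a minimum.

```latex
\begin{align*}
\frac{dF}{dx_i} &= 2a_i(x_i - 1) + 2b_i(x_i + 1) = 0 \\
\Longrightarrow\quad (a_i + b_i)\,x_i &= a_i - b_i \\
\Longrightarrow\quad x_i &= \frac{a_i - b_i}{a_i + b_i},
\qquad
\frac{d^2F}{dx_i^2} = 2(a_i + b_i) > 0 .
\end{align*}
```

Under the same assumptions $|a_i - b_i| \le a_i + b_i$, so the minimizer always lies in $[-1, 1]$.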


8.5.  Complete Global Optimization Algorithms. The SAT14.5, SAT14.6,
and SAT14.7 algorithms are incomplete algorithms. In order to achieve high
computing efficiency and to make them complete algorithms, we combine in
SAT14.11 to SAT14.20 global optimization algorithms with backtracking/resolution
procedures [207, 217]. Therefore, these algorithms are able to verify
satisfiability as well as unsatisfiability. Figure 22 gives a typical backtracking
global optimization algorithm.

For  small  and  medium  size  problems,  backtracking  is  able  to  verify  unsatisfia¬ 
bility  quickly  for  certain  classes  of  formulas  but  is  slow  when  it  comes  to  verifying 
satisfiability,  as  all  possible  resolutions  need  to  be  tried  out  before  concluding  that 
the  inference  relation  holds  or  that  the  input  formula  is  satisfiable.  From  our  ex¬ 
perience,  a  combined  global  optimization  algorithm  with  backtracking/resolution 
procedures  would  perform  well  for  certain  classes  of  satisfiable  and  unsatisfiable 
formulas. 

Recently  some  researchers  investigated  the  number  of  solutions  of  SAT  formu¬ 
las.  Extending  Iwama’s  work  [280],  Dubois  gave  a  combinatorial  formula  com¬ 
puting  the  number  of  solutions  of  a  set  of  any  clauses  [148].  He  and  Carlier  also 
studied  the  mathematical  expectation  of  the  number  of  solutions  for  a  probabilistic 
model  [149].  For  an  incomplete  SAT  algorithm,  the  number  of  solutions  can  have 
a strong effect on its computing efficiency. For a complete SAT algorithm,
however, the number of search levels plays a crucial role. In the SAT14.11 to
SAT14.20 algorithms, the number of solutions is an important factor in deciding
how global optimization and backtracking/resolution procedures interact
[212, 217].

procedure SAT14.11 ()
begin
    /* initialization */
    get_a_SAT_instance();
    x_0 := select_an_initial_point();
    f := evaluate_object_function(x_0);

    /* backtracking with global optimization */
    x* := backtracking(x_0);
end;

procedure backtracking(x_i)
begin
    /* global optimization assigns v to x_i */
    v := global_optimization();
    x_i := v;
    V_i := V_i - {v};

    /* append variable x_i to the partial path */
    path[x_i] := i;

    if path broken then backtracking;
    if solution found then return x*;
    else backtracking(next x_i);
end;

Figure 22.  SAT14.11: a complete global optimization algorithm
with backtracking.


8.6.  Continuous Lagrangian-Based Constrained Optimization Algorithms.
In previous subsections, we have discussed unconstrained (discrete or
continuous) formulations of SAT problems based on optimizing a single unconstrained
objective  function.  To  avoid  getting  trapped  in  local  minima,  algorithms  for  solv¬ 
ing  these  problems  must  have  strategies  to  escape  from  local  minima.  Some  of 
these  strategies,  such  as  random  restarts  and  tunneling,  move  the  search  to  a  new 
starting  point  and  start  over.  In  the  process  of  doing  so,  vital  information  obtained 
during  the  descent  to  the  current  local  minimum  may  be  lost.  Other  strategies 
may  rely  on  an  internal  or  an  external  force  to  bring  the  search  trajectory  out  of  a 
local  minimum.  Although  they  work  well  for  continuous  problems,  they  may  have 
difficulty  in  dealing  with  SAT  problems  whose  objective  values  are  integers. 

One  way  to  bring  a  search  out  of  a  local  minimum  is  to  formulate  a  SAT  problem 
as  a  constrained  optimization  problem  as  shown  in  (4.10)  and  (4.16).  By  using  the 
force  provided  by  the  violated  constraints,  the  search  trajectory  can  be  brought  out 
of a local minimum. One way to implement this idea is to compute the sum of the
constraints weighted by penalties and to update the penalties continuously during
the search. The difficulty with this approach lies in the choice of the proper
penalties. A more systematic approach is to use a Lagrangian formulation. In this
and  the  next  subsections,  we  show  two  Lagrangian  formulations  of  SAT  problems, 
one in the continuous space and the other in the discrete space.

As indicated in (4.16), a SAT problem can first be transformed into a continuous
constrained optimization problem.

(8.8)  $\min_{y \in E^n} F(y) = \sum_{i=1}^{m} c_i(y)$

       subject to  $c_i(y) = 0 \quad \forall i \in \{1, 2, \ldots, m\}$

where $y = (y_1, y_2, \ldots, y_n)$, and $c_i(y)$ is defined in (4.12) and (4.13)
and repeated as follows.

$c_i(y) = \prod_{j=1}^{n} q_{i,j}(y_j)$

$q_{i,j}(y_j) = \begin{cases} (1 - y_j)^2 & \text{if } x_j \text{ in } C_i \\ y_j^2 & \text{if } \bar{x}_j \text{ in } C_i \\ 1 & \text{otherwise} \end{cases}$

Here, $F(y)$ is a scalar differentiable function that takes the norm of its
argument so that $F(y) = 0$ iff $c_i(y) = 0$ for all $i$.

There  are  three  advantages  in  reformulating  the  original  discrete  unconstrained 
problem  into  a  continuous  constrained  problem.  First,  a  continuous  objective  func¬ 
tion  may  smooth  out  local  minima  in  the  discrete  space,  allowing  global/local  search 
methods  to  bypass  these  local  minima  in  the  continuous  space.  Second,  a  continu¬ 
ous  objective  value  can  indicate  how  close  the  constraints  are  being  satisfied,  hence 
providing  additional  guidance  in  leading  to  a  satisfiable  assignment.  Third,  when 
the  search  is  stuck  in  a  local  minimum  and  some  of  the  constraints  are  violated, 
the  violated  constraints  can  provide  a  force  to  lead  the  search  out  of  the  local  min¬ 
imum.  This  is  more  effective  than  restarting  from  a  new  starting  point,  as  local 
information  observed  during  the  search  can  be  preserved. 

Active research in the past two decades has produced a variety of methods
for finding global solutions to nonconvex nonlinear optimization problems [505,
266, 163, 239, 402, 374]. In general, transformational and non-transformational
methods are two approaches to solving these problems.

Non-transformational approaches include discarding methods,
back-to-feasible-region methods, and enumerative methods. Discarding methods
[276, 374] drop solutions once they are found to be infeasible, and
back-to-feasible-region methods [297] attempt to maintain feasibility by
reflecting moves from boundaries if such moves would leave the current feasible
region. Both of these methods have been combined with global search and do not
involve transformation to relax constraints. Last, enumerative methods [266] are
generally too expensive to apply except for problems with linear objectives and
constraints, and for bilinear programming problems [26].

Transformational  approaches,  on  the  other  hand,  convert  a  problem  into  an¬ 
other  form  before  solving  it.  Well  known  methods  include  penalty,  barrier,  and 
Lagrange-multiplier  methods  [356].  Penalty  methods  incorporate  constraints  into 
part  of  the  objective  function  and  require  tuning  penalty  coefficients  either  before 
or  during  the  search.  Barrier  methods  are  similar  except  that  barriers  are  set  up  to 
avoid  solutions  from  going  out  of  feasible  regions.  Both  methods  have  difficulties 
when  they  start  from  an  infeasible  region  and  when  feasible  solutions  are  hard  to 
find.  However,  they  can  be  combined  with  other  methods  to  improve  their  solution 
quality.

In Lagrangian methods, Lagrange variables are introduced to gradually resolve
constraints  through  iterative  updates.  They  are  exact  methods  that  optimize  the 
objective  using  Lagrange  multipliers  to  meet  the  Kuhn-Tucker  conditions  [356]. 
Eq.  (8.8)  can  be  reformulated  using  Lagrange  multipliers  into  the  following  uncon¬ 
strained  problem. 

(8.9)   $L(y, \lambda) = F(y) + \lambda^T c(y)$   (Lagrangian function)

(8.10)  $\mathcal{L}(y, \lambda) = F(y) + \|c(y)\|_2^2 + \lambda^T c(y)$   (Augmented Lagrangian function)

where $c(y) = (c_1(y), c_2(y), \ldots, c_m(y))$, and $\lambda^T$ is the transpose
of the vector of Lagrange multipliers. The augmented Lagrangian formulation is
often preferred because it provides better numerical stability.

According to classical optimization theory [356], all the extrema of (8.10),
whether local or global, are roots of the following sets of equations.

(8.11)  $\nabla_y \mathcal{L}(y, \lambda) = 0$  and  $\nabla_\lambda \mathcal{L}(y, \lambda) = 0$

These conditions are necessary to guarantee the (local) optimality of the solution
of (8.8).

Search  methods  for  solving  (8.10)  can  be  classified  into  local  and  global  al¬ 
gorithms.  Local  minimization  algorithms,  such  as  gradient-descent  and  Newton’s 
methods,  find  local  minima  efficiently  and  work  best  in  uni-modal  problems.  Global 
methods,  in  contrast,  employ  heuristic  strategies  to  look  for  global  minima  and  do 
not  stop  after  finding  a  local  minimum  [403,  505,  356].  Note  that  gradients  and 
Hessians  can  be  used  in  both  local  and  global  methods  [505]. 

Local search methods can be used to solve (8.11) by forming a Lagrangian
dynamic system that includes a set of dynamic equations to seek equilibrium points
along a gradient path. These equilibrium points are called saddle points of (8.11),
which correspond to the constrained minima of (8.8). The Lagrangian dynamic
system of equations is as follows.

(8.12)  $\frac{dy(t)}{dt} = -\nabla_y \mathcal{L}(y(t), \lambda(t))$  and  $\frac{d\lambda(t)}{dt} = \nabla_\lambda \mathcal{L}(y(t), \lambda(t))$

Optimal solutions to the continuous formulation are governed by the Saddle
Point Theorem, which states that $y^*$ is a local minimum to the original problem
defined in (8.8) if and only if there exists $\lambda^*$ such that
$(y^*, \lambda^*)$ constitutes a saddle point of the associated Lagrangian
function $\mathcal{L}(y, \lambda)$. Here, a saddle point $(y^*, \lambda^*)$ of
Lagrangian function $\mathcal{L}(y, \lambda)$ is defined as one that satisfies the
following condition.

(8.13)  $\mathcal{L}(y^*, \lambda) \le \mathcal{L}(y^*, \lambda^*) \le \mathcal{L}(y, \lambda^*)$

for all $(y^*, \lambda)$ and all $(y, \lambda^*)$ sufficiently close to
$(y^*, \lambda^*)$.

There  are  four  advantages  in  using  a  Lagrangian  formulation  to  solve  con¬ 
strained  optimization  problems. 

•  Saddle  points  of  (8.11)  can  be  found  by  local  gradient  descent/ascent  meth¬ 
ods  defined  in  (8.12).  The  first  equation  in  (8.12)  has  a  minus  sign  that 
optimizes  the  original  variables  along  a  descending  path,  whereas  the  sec¬ 
ond  equation  optimizes  along  an  ascending  path.  Alternatively,  (8.12)  can 
be  considered  as  a  global  search  algorithm  that  has  a  local-search  component 
based on a descent algorithm in the original variable space. When the search
reaches a local minimum, the search is brought out of the local minimum using
the weights imposed by its Lagrange multipliers. This mechanism allows
the  search  to  continue  in  its  present  trajectory  without  any  breaks. 

•  Lagrangian  search  is  similar  to  penalty-based  methods  in  the  sense  that  the 
Lagrange  variables  are  increased  like  penalties  when  constraints  are  violated. 
However,  it  is  more  general  than  penalty-based  methods  because  the  increase 
of  a  Lagrange  variable  is  self-adjusting  and  is  governed  by  the  amount  that 
the  corresponding  constraint  is  violated. 

•  The  search  modeled  by  (8.12)  can  be  started  from  any  starting  point  and 
will  continue  until  a  saddle  point  is  found. 

•  Since  assignments  of  y  where  the  constraints  in  (8.8)  are  satisfied  are  also 
assignments  that  minimize  the  objective,  saddle  points  of  (8.11)  found  by 
solving  (8.12)  correspond  to  satisfiable  assignments  to  the  original  SAT  prob¬ 
lem. 

It is important to point out that a Lagrangian search modeled by (8.12) is
incomplete: if it does not find a solution in a finite amount of time, it does not
prove whether the original SAT problem is satisfiable or not. Hence, it cannot
establish unsatisfiability in any finite amount of time.

Unfortunately,  continuous  gradient-based  local  search  methods  for  solving  (8.12) 
are  very  time  consuming.  Our  experience  [81]  indicates  that  continuous  descent 
methods are several orders of magnitude more complex than discrete descent
methods. For instance, it takes over one hour of CPU time on a Sun SS10 workstation
to  solve  a  problem  with  200  variables  and  60  constraints.  Consequently,  continuous 
formulations  are  not  promising  in  solving  large  SAT  problems.  In  the  next  subsec¬ 
tion,  we  extend  continuous  Lagrangian  methods  to  discrete  Lagrangian  methods. 
Surprisingly,  discrete  methods  work  much  better  and  can  solve  some  benchmark 
problems  that  cannot  be  solved  by  other  local/global  search  algorithms. 
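To make the dynamic system (8.12) concrete, the sketch below performs one explicit-Euler step of descent in $y$ and ascent in $\lambda$. It is an illustration under stated assumptions, not the implementation evaluated in [81]: it uses the plain Lagrangian (8.9) instead of the augmented form (8.10), the literal functions $(1 - y_j)^2$ and $y_j^2$ from (8.8), DIMACS-style clauses with each variable occurring at most once per clause, and a hypothetical step size `dt`.

```python
def literal_q(lit, y):
    """q_ij(y_j): (1 - y_j)^2 for a positive literal, y_j^2 for a negated one."""
    y_j = y[abs(lit) - 1]
    return (1.0 - y_j) ** 2 if lit > 0 else y_j ** 2


def literal_dq(lit, y):
    """Derivative of q_ij with respect to y_j."""
    y_j = y[abs(lit) - 1]
    return -2.0 * (1.0 - y_j) if lit > 0 else 2.0 * y_j


def clause_values(cnf, y):
    """c(y) = (c_1(y), ..., c_m(y)), each a product of literal functions."""
    values = []
    for clause in cnf:
        prod = 1.0
        for lit in clause:
            prod *= literal_q(lit, y)
        values.append(prod)
    return values


def lagrangian(cnf, y, lam):
    """L(y, lambda) = F(y) + lambda^T c(y) with F(y) = sum_i c_i(y)."""
    return sum((1.0 + l) * c for l, c in zip(lam, clause_values(cnf, y)))


def grad_y(cnf, y, lam):
    """Gradient of L with respect to y, via the product rule on each clause."""
    grad = [0.0] * len(y)
    for l, clause in zip(lam, cnf):
        for lit in clause:
            others = 1.0
            for other in clause:
                if other != lit:
                    others *= literal_q(other, y)
            grad[abs(lit) - 1] += (1.0 + l) * literal_dq(lit, y) * others
    return grad


def euler_step(cnf, y, lam, dt=0.05):
    """One Euler step of (8.12): descend in y, ascend in lambda."""
    g = grad_y(cnf, y, lam)
    c = clause_values(cnf, y)          # gradient of L with respect to lambda
    new_y = [yj - dt * gj for yj, gj in zip(y, g)]
    new_lam = [lj + dt * ci for lj, ci in zip(lam, c)]
    return new_y, new_lam
```

Since every $c_i(y)$ is nonnegative, the multipliers never decrease along such a trajectory. Each step costs one full gradient evaluation over all clauses, which gives some intuition for why the continuous search is reported to be slow.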

8.7.  Discrete  Lagrangian-Based  Constrained  Optimization  Algorithms. 
To  overcome  the  computational  complexity  of  continuous  Lagrangian  methods  while 
preserving  their  benefits,  we  show  in  this  subsection  a  discrete  constrained  formu¬ 
lation  of  a  SAT  problem  and  its  solution  using  a  discrete  Lagrangian  method.  The 
discrete  Lagrangian  method  is  extended  from  the  theory  of  continuous  Lagrangian 
methods. 

Recall from (4.10) in Section 4.1 the following discrete constrained formulation
of a SAT problem.

(8.14)  $\min_{y \in \{0,1\}^n} N(y) = \sum_{i=1}^{m} U_i(y)$

        subject to  $U_i(y) = 0 \quad \forall i \in \{1, 2, \ldots, m\}$.

Without going into all the details [535], the continuous Lagrangian method
can be extended to work on discrete problems. The discrete Lagrangian function
for (8.14) is defined as follows.

(8.15)  $L(y, \lambda) = N(y) + \lambda^T U(y)$

where $y \in \{0,1\}^n$, $U(y) = (U_1(y), \ldots, U_m(y)) \in \{0,1\}^m$, and
$\lambda^T$ is the transpose of $\lambda = (\lambda_1, \lambda_2, \ldots, \lambda_m)$
that denotes the Lagrange multipliers. (Note that the $\lambda_i$ can be
continuous variables.)

1.  Set initial x randomly by a fixed random seed
2.  Set initial lambda to be zero
3.  while x is not a solution, i.e., N(x) > 0
4.      update x: x <- x - Delta_x L(x, lambda)
5.      if condition for updating lambda is satisfied then
6.          update lambda: lambda <- lambda + c x U(x)
7.      end if
8.  end while

Figure 23.  Generic discrete Lagrangian algorithm A for solving
SAT problems.


In a definition similar to that in (8.13), a saddle point $(y^*, \lambda^*)$ of
$L(y, \lambda)$ in (8.15) is defined as one that satisfies the following condition.

(8.16)  $L(y^*, \lambda) \le L(y^*, \lambda^*) \le L(y, \lambda^*)$

for all $\lambda$ sufficiently close to $\lambda^*$ and for all $y$ whose Hamming
distance from $y^*$ is 1.

Similar to (8.12), the Discrete Lagrangian Method (DLM) for solving SAT
problems can be defined as a set of difference equations,

(8.17)  $y^{k+1} = y^k - \Delta_y L(y^k, \lambda^k)$

(8.18)  $\lambda^{k+1} = \lambda^k + U(y^k)$,

where $\Delta_y L(y, \lambda)$ is the discrete gradient operator with respect to
$y$ such that $\Delta_y L(y, \lambda) = (\delta_1, \delta_2, \ldots, \delta_n) \in \{-1, 0, 1\}^n$,
$\sum_{i=1}^{n} |\delta_i| = 1$, and $(y - \Delta_y L(y, \lambda)) \in \{0,1\}^n$.
Informally, $\Delta_y$ represents all the neighboring points of $y$.

8.8.  An Implementation of a Basic Discrete Lagrangian Algorithm.
Figure 23 shows the pseudo code of A, a generic discrete Lagrangian algorithm
implementing (8.17) and (8.18). It performs descents in the original variable space
of $y$ and ascents in the Lagrange-multiplier space of $\lambda$. In discrete
space, $\Delta_y L(y, \lambda)$ is used in place of the gradient function in
continuous space. We call one pass through the while loop one iteration.

In the following, we describe some of the considerations in implementing DLM A.

(a)  Initial Points and Restarts (Lines 1-2). DLM is started from either the
origin or from a random initial point generated by calling drand48() using a fixed
random seed 101. Further, $\lambda$ is always set to zero. The fixed initial points
allow the results to be reproduced easily.

(b)  Descent and Ascent Strategies (Line 4). There are two ways to calculate
$\Delta_y L(y, \lambda)$: greedy and hill-climbing, each involving a search in the
range of Hamming distance one from the current $y$ (assignments with one variable
flipped from the current assignment $y$).

In a greedy strategy, the assignment leading to the maximum decrease in the
Lagrangian-function value is selected to update the current assignment. Therefore,
all assignments in the vicinity need to be searched every time, leading to a
computational complexity of $O(n)$, where $n$ is the number of variables in the
SAT problem. In hill-climbing, the first assignment leading to a decrease in the
Lagrangian-function value is selected to update the current assignment. Depending
on the order of search and the number of assignments that can be improved,
hill-climbing strategies are generally less computationally expensive than greedy
strategies.

A comparison of the two strategies shows that hill-climbing is orders of
magnitude faster and results in solutions of comparable quality.
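The difference between the two descent strategies can be illustrated on the discrete Lagrangian $L(y, \lambda) = N(y) + \lambda^T U(y)$. The sketch below is a naive illustration with hypothetical function names and an assumed DIMACS-style clause encoding; it re-evaluates $L$ from scratch for every candidate flip, whereas a practical implementation would update it incrementally.

```python
def unsat_vector(cnf, y):
    """U(y): entry i is 1 if clause i is unsatisfied by the 0/1 assignment y."""
    return [0 if any((y[abs(l) - 1] == 1) == (l > 0) for l in clause) else 1
            for clause in cnf]


def lagrangian_value(cnf, y, lam):
    """L(y, lambda) = N(y) + lambda^T U(y), with N(y) = sum_i U_i(y)."""
    return sum((1.0 + l) * u for l, u in zip(lam, unsat_vector(cnf, y)))


def flipped(y, j):
    """Copy of y with variable j (0-based) negated."""
    z = list(y)
    z[j] = 1 - z[j]
    return z


def greedy_flip(cnf, y, lam):
    """Index of the flip with the largest decrease in L, or None at a local minimum."""
    best_j, best_val = None, lagrangian_value(cnf, y, lam)
    for j in range(len(y)):
        val = lagrangian_value(cnf, flipped(y, j), lam)
        if val < best_val:
            best_j, best_val = j, val
    return best_j


def hill_climbing_flip(cnf, y, lam):
    """Index of the first flip that decreases L, or None at a local minimum."""
    current = lagrangian_value(cnf, y, lam)
    for j in range(len(y)):
        if lagrangian_value(cnf, flipped(y, j), lam) < current:
            return j
    return None
```

The greedy variant always scans every neighbor, while the hill-climbing variant stops at the first improving one, which is where its speed advantage comes from.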

(c)  Conditions for updating $\lambda$ (Line 5). The frequency with which
$\lambda$ is updated affects the performance of a search. The considerations here
are different from those of continuous problems. In a discrete problem, descents
based on discrete gradients usually make small changes in $L(y, \lambda)$ in each
update of $y$ because only one variable changes. Hence, $\lambda$ should not be
updated in each iteration of the search to avoid biasing the search in the
Lagrange-multiplier space of $\lambda$ over the original variable space of $y$.

Experimental results show that $\lambda$ should be updated only when $\Delta_y L(y, \lambda) = 0$.
At this point, a local minimum in the original variable space is reached, and the
search can only escape from it by updating $\lambda$. This strategy amounts to pure
descents in the original y variable space, while holding $\lambda$ constant, until a local
minimum is reached.

Note  that  this  strategy  is  similar  to  Morris’  “break  out”  strategy  [385]  and 
Selman  and  Kautz’s  GSAT  [470,  471]  that  applies  adaptive  penalties  to  escape 
from  local  minima.  One  problem  that  is  overlooked  in  these  strategies  is  the  growth 
of  penalty  terms.  In  solving  a  difficult  SAT  problem,  penalty  terms  may  grow  to 
become  very  large  as  the  search  progresses,  causing  large  swings  in  the  objective 
function  and  delaying  convergence  of  the  search.  Solutions  to  this  issue  are  discussed 
next. 

(d) Amount of update of $\lambda$ (Line 6). A parameter c controls the magnitude of
changes in $\lambda$. In general, c can be a vector of real numbers, allowing non-uniform
updates of $\lambda$ across different dimensions and possibly across time. For simplicity,
c = 1 has been found to work well for most of the benchmarks tested. However,
for some larger and more difficult problems, a smaller c can result in shorter search
time.

The update rule in Line 6 results in nondecreasing $\lambda$. This is true because $U(x)$
is either 0 or 1: when a clause is not satisfied, its corresponding $\lambda$ is increased; and
when a clause is satisfied, its corresponding $\lambda$ is not changed. In contrast, in
applying Lagrangian methods to solve continuous problems with equality constraints,
the Lagrange multiplier $\lambda_i$ of constraint $g_i(x) = 0$ increases when $g_i(x) > 0$ and
decreases when $g_i(x) < 0$.

When there is no mechanism to reduce the Lagrange multipliers, they can grow
without bound, causing large swings in the Lagrangian-function value and making
the search terrain more rugged. Although this strategy does not worsen the search
time for most of the benchmark problems tested, $\lambda$ values can become very large as
time goes on for a few difficult problems requiring millions of iterations. When this
happens, the search has difficulty in identifying an appropriate direction to move.

This situation is illustrated in the first two graphs of Figure 24, which show the
behavior of DLM when it was applied to solve one of the more difficult DIMACS
SAT benchmark problems. Here, the search is stuck in a sub-optimal basin in the
space of the objective function where the number of unsatisfied clauses fluctuates
around 20. Since the search terrain modeled by L becomes more rugged as the
Lagrange multipliers increase, the search has difficulty escaping from this
region.

Figure 24. Execution profiles of "g125-17," one of the difficult
DIMACS benchmark problems. Figures (a), (c), and (e) plot the
Lagrangian-function values and the number of unsatisfied clauses
versus the number of iterations. Figures (b), (d), and (f) plot the
minimum, average, and maximum values of Lagrange multipliers.

To overcome this problem, $\lambda$ should be reduced periodically. For instance, in
the last two graphs of Figure 24, $\lambda$ was scaled down by a factor of 1.5 every 10,000
iterations. This strategy, when combined with other strategies to be discussed next,
restricts the growth of Lagrange multipliers, leading to the solution of some of the
more difficult benchmark problems.
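
A minimal sketch of the two multiplier operations described above, assuming the multipliers are kept in a Python list with one entry per clause. The function names and the list representation are illustrative assumptions; the default values c = 1, a period of 10,000 iterations, and a factor of 1.5 are the ones quoted in the text.

```python
def update_multipliers(lam, unsat_flags, c=1.0):
    """lam_i <- lam_i + c * U_i: only multipliers of unsatisfied clauses grow."""
    return [l + c * u for l, u in zip(lam, unsat_flags)]

def maybe_scale_down(lam, iteration, period=10_000, factor=1.5):
    """Divide every multiplier by `factor` once every `period` iterations."""
    if iteration > 0 and iteration % period == 0:
        return [l / factor for l in lam]
    return lam
```

Because U_i is 0 or 1, the first function can never decrease a multiplier; the second is the only mechanism in this sketch that counteracts unbounded growth.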

(e) Plateaus in the Search Space. In binary problems like SAT, a search may
find a very small subset of variables that can lead to no degradation in the objective
function. Flipping variables in this small subset successively is likely to lead to a
cycle in the search space. To avoid such an undesirable situation, variables that
have been flipped in the recent past can be stored in a tabu list [199, 238] and will
not be flipped until they are out of the list.

Further, for large SAT problems formulated as discrete optimization problems,
the search may encounter large and flat, but suboptimal, basins. Here, gradients
in all directions are the same and the search may wander forever. The discrete
gradient operator $\Delta_y L(y, \lambda)$ may have difficulties in basins/plateaus because it
only examines adjacent points of $L(y, \lambda)$ that differ in one dimension. Hence, it
may not be able to distinguish a plateau from a local minimum.

One way to escape is to allow uphill moves. For instance, in GSAT’s random
walk strategy [472], uphill walks are allowed with probability p. However, the
chance of getting a sequence of uphill moves to get out of a deep basin is small since
each walk is independent.

There  are  two  effective  strategies  that  allow  a  plateau  to  be  searched. 

(a) Flat-move strategy. We need to determine the time to change $\lambda$ when the
search reaches a plateau. As indicated earlier, updating $\lambda$ when the search is in
a plateau changes the surface of the plateau and may make it more difficult for
the search to find a local minimum somewhere inside the plateau. To avoid this
situation, a strategy called flat move [535] can be employed. This allows the search
to continue for some time in the plateau without changing $\lambda$, so that the search
can traverse states with the same Lagrangian-function value. How long flat moves
should be allowed is heuristic and possibly problem dependent. Note that this
strategy is similar to Selman’s “sideway-move” strategy [471].

(b) Tabu list. This search strategy aims to avoid revisiting the same set of
states  in  a  plateau.  In  general,  it  is  impractical  to  remember  every  state  the  search 
visits  in  a  plateau  due  to  the  large  storage  and  computational  overheads.  A  tabu 
list  [199,  238]  can  be  kept  to  maintain  the  set  of  variables  flipped  in  the  recent 
past  and  to  avoid  flipping  a  variable  if  it  is  in  the  tabu  list. 
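
A fixed-length tabu list of recently flipped variables can be sketched with a bounded deque. The class name and its two methods are hypothetical; the default size of 30 is the value used in the experiments reported for Figure 24.

```python
from collections import deque

class TabuList:
    """Remembers the most recently flipped variables, oldest entries expiring first."""

    def __init__(self, size=30):
        # A deque with maxlen silently discards the oldest entry when full.
        self._recent = deque(maxlen=size)

    def record(self, var):
        """Note that `var` has just been flipped."""
        self._recent.append(var)

    def allows(self, var):
        """A variable may be flipped only if it is not currently in the list."""
        return var not in self._recent
```

Storing variables rather than full states keeps the memory cost proportional to the list size, at the price of occasionally forbidding a flip that would not in fact revisit a state.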

The last four graphs of Figure 24 illustrate the performance of DLM when
the search maintains a tabu list of size 30, when it is allowed to stay in a basin
for at most 50 moves (flat-move limit), and when all Lagrange multipliers are
periodically scaled down. These graphs show a significant reduction in the growth of
Lagrangian-function values and Lagrange multipliers.

By using these strategies, DLM can successfully solve many of the hard problems
in the DIMACS benchmark suite [535]. These results are presented in Section 13.


8.9. Convergence Property and Average Time Complexity. Gu, Gu,
and Du [227] have analyzed the convergence ratios of three basic methods: the
steepest descent method, Newton’s method, and the coordinate descent method for
objective function f defined in the UniSAT7 input model. They prove that, subject
to certain conditions [356], the steepest descent method has a linear convergence
ratio $[(A - a)/(A + a)]^2 < 1$, Newton’s method has a convergence ratio of order
two, and the coordinate descent method has a convergence ratio $(1 - \ldots)$,
where $A \ge a > 0$ are the largest and smallest eigenvalues of the Hessian matrix
$H(y)$, respectively.

From these convergence properties, Gu, Gu, and Du roughly estimate that, subject
to certain conditions [356], the UniSAT7 problem can be solved in $O(\log(n + m))$
iterations by the steepest descent method and in $O(m \log(n + m))$
iterations by the coordinate descent method, on the assumption that the algorithm
is not stuck at a local minimum point.

Gu and Gu have made a preliminary analysis of the typical time complexity
of some global optimization SAT algorithms [216]. It shows that the SAT14.5
algorithm, with probability at least $1 - e^{-n}$, finds a solution within
$k = O(n(\log n)^2)$ iterations of the while loop for a randomly generated satisfiable
CNF formula with $l \ge 3$ and $m/n \le \alpha 2^l / l$, where $\alpha < 1$ is a constant.
From this and the fact that the run time of procedure enumerate() is $O(2^n)$, the
typical time complexity of the SAT14.5 algorithm is

$$(1 - e^{-n})\, O(n(\log n)^2 (lmn)) + e^{-n}\, O(2^n) = O(lm(n \log n)^2).$$

Clearly, the SAT14.5 algorithm can give an answer to an unsatisfiable CNF formula
in $O(2^n)$ time.


9.  Integer  Programming  Method 

In  this  section,  we  first  give  an  integer  program  (IP)  formulation  of  SAT.  Then 
we  describe  some  traditional  techniques  of  using  the  integer  programming  approach 
to  solve  SAT. 


9.1. An Integer Programming Formulation for SAT. In order to represent
SAT inputs in the framework of mathematical programming, we identify logic
value true with integer 1 and false with -1. As in the UniSAT models (Section 8.1),
all Boolean $\vee$ and $\wedge$ connectives are transformed into + and × of ordinary addition
and multiplication operations, respectively. Using a standard transformation, the
ith clause $C_i$ is transformed into a linear inequality [301, 299]:

(9.1)  $\sum_{j=1}^{n} q_{ij}(w_j) \ge 2 - |C_i|$

where

(9.2)  $q_{ij}(w_j) = \begin{cases} w_j & \text{if literal } x_j \text{ is in clause } C_i \\ -w_j & \text{if literal } \bar{x}_j \text{ is in clause } C_i \\ 0 & \text{if neither } x_j \text{ nor } \bar{x}_j \text{ is in } C_i \end{cases}$

where $w_j$ is the jth integer variable.

Restricting $w_j = \pm 1$, $j = 1, 2, \ldots, n$, requires that extra constraints be added
to ensure that each $w_j$ lies in the closed interval [-1, 1], i.e., $-1 \le w_j \le 1$ for
$j = 1, 2, \ldots, n$.

Following the above formulation, for example, the CNF formula

$(x_1 \vee \bar{x}_2) \wedge (\bar{x}_1 \vee x_2 \vee x_3) \wedge (x_2 \vee x_3)$

is translated into

$w_1 - w_2 \ge 0$
$-w_1 + w_2 + w_3 \ge -1$
$w_2 + w_3 \ge 0$

So an integer programming formulation is obtained for SAT as: finding $w_j = \pm 1$,
such that

(9.3) 
or 

(9.4) 
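
The clause-to-inequality translation can be checked mechanically. The sketch below is an illustration, not the authors' code: it assumes clauses are lists of signed integers (k for variable k, -k for its negation), and the function names are hypothetical. Each clause becomes a coefficient row and a right-hand side of 2 minus the clause length.

```python
def clause_to_inequality(clause, n):
    """Return (coeffs, rhs), meaning sum_j coeffs[j-1] * w_j >= rhs."""
    coeffs = [0] * n
    for lit in clause:
        # +1 for an unnegated literal, -1 for a negated one
        coeffs[abs(lit) - 1] = 1 if lit > 0 else -1
    return coeffs, 2 - len(clause)

def satisfies(coeffs, rhs, w):
    """Check one inequality for a vector w with entries in {-1, +1}."""
    return sum(c * wj for c, wj in zip(coeffs, w)) >= rhs
```

With every w_j equal to +1 or -1, a clause of k literals has left-hand side -k when all of its literals are false and at least 2 - k otherwise, which is why the threshold 2 - k separates the two cases.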

While the Simplex method is effective for solving linear programs (LP), there is
no single technique that is fast for solving integer programs. Therefore, the approaches
developed try to solve the integer program as an integer linear program (ILP).* If
the solution is non-integer, one rounds off the values to the nearest integers and
checks whether this corresponds to a solution of the original input. If the rounded-off
values do not correspond to a solution, one adds a new constraint and computes a
solution of the modified linear program. So far, most methods developed to solve
the integer programs for SAT work indirectly on the corresponding integer linear
programs.

Researchers have observed that the optimal integer-programming solution is
usually not obtained by rounding the linear-programming solution, although this is
possible in certain cases (see Section 10). The closest integer point to the optimal
linear-programming solution may not even be feasible. In some cases, the nearest
feasible integer point to the linear-programming solution is far removed from the
optimal integer point. Thus, when using an integer linear program to solve the
integer program for SAT, it is not sufficient simply to round linear-programming
solutions.
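
A small check makes the gap between the relaxation and the integer program concrete. Under the ±1 encoding of Section 9.1, the all-zero point satisfies the relaxed inequality of every clause with at least two literals, even when the formula is unsatisfiable. The example formula and the helper names below are illustrative assumptions.

```python
from itertools import product

def feasible(clauses, w):
    """True if w satisfies sum of signed w_j >= 2 - |C| for every clause C."""
    for clause in clauses:
        lhs = sum((1 if l > 0 else -1) * w[abs(l) - 1] for l in clause)
        if lhs < 2 - len(clause):
            return False
    return True

# An unsatisfiable 2-CNF: all four sign patterns over x1 and x2.
UNSAT_FORMULA = [[1, 2], [1, -2], [-1, 2], [-1, -2]]

# The fractional point w = (0, 0) lies in the relaxed region ...
relaxed_ok = feasible(UNSAT_FORMULA, [0.0, 0.0])
# ... but no point with all entries in {-1, +1} does.
integer_ok = any(feasible(UNSAT_FORMULA, w) for w in product((-1, 1), repeat=2))
```

Here the relaxation is feasible while the integer program is not, so no rounding of the fractional point can yield a solution.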

In  the  following  sections,  we  describe  existing  integer  programming  methods  to 
solve  SAT. 

9.2. Linear Program Relaxation. A basic method to solve an integer program
is the linear program relaxation. In this approach, the LP relaxation is
achieved by replacing $x_i \in \{0, 1\}$ with $0 \le x_i \le 1$. The LP relaxation can be solved
efficiently with some sophisticated implementations of Dantzig’s Simplex method,
such as MINOS [386], or some variations of Karmarkar’s interior point method
[303].

Hooker reported early on that solving a linear programming relaxation of SAT
frequently produces an integer solution [258]. Kamath et al. used MINOS 5.1 to
solve the linear programming relaxation [301, 299]. They tried some small SAT
inputs and found that the Simplex method failed to find integral solutions to the
linear programming relaxations in the majority of instances tested.


* An integer linear program (ILP) is a linear program further constrained by integrality
restrictions.

Figure 25. An example of a branch-and-bound tree.


9.3. Branch and Bound Method. Branch-and-bound is essentially a strategy
of “divide and conquer.” It is straightforward and is the most successful way
to solve the integer programming problem. The idea is to systematically partition
the linear-programming feasible region into manageable subdivisions and make
assessments of the integer-programming problem based on these subdivisions. When
moving from a region to one of its subdivisions, we add one constraint that is not
satisfied by the optimal linear-programming solution over the parent region. So the
linear programs corresponding to the subdivisions can be solved efficiently. In
general, there are a number of ways to divide the feasible region, and as a consequence
there are a number of branch-and-bound algorithms.

We show the basic procedures of branch-and-bound with the simple example
shown in Figure 25. The method starts with the fractional solution given by the
corresponding LP relaxation. Then a variable with a fractional value is selected. For
example, let $x_1$ be such a variable, and set $x_1 \le 0$ as an additional constraint; i.e.,
branch on $x_1$ with constraint $x_1 \le 0$. Re-solve the LP relaxation with this augmented
constraint. If it still produces a non-integer solution, branch on another non-integer
variable, say $x_2$, first with constraint $x_2 \le 0$, and re-solve the LP with the extra
constraints $x_1 \le 0$ and $x_2 \le 0$. This process continues until solving the augmented LP
yields an integer solution, i.e., an incumbent solution, so there is no need to branch
further at that node. Since we do not know this to be optimal, a backtracking
procedure is required to search with the extra constraints $x_1 \le 0$ and $x_2 > 0$, re-solve
the augmented LP, and continue the process until an integer solution is obtained.

The above process produces a binary tree as shown in Figure 25. In this way,
we implicitly exhaust all possibilities and conclude with an optimal solution. Note
that each time we obtain an incumbent solution we get a new upper bound on the
minimum value of the objective function. If at some node the LP yields an
objective-function value that exceeds the best upper bound obtained so far,
then we can fathom that node, since any solution obtained at its successors can
only be worse.
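
The branching, incumbent, and fathoming steps can be sketched without an LP solver. The following illustration is a deliberate simplification and not the LP-based procedure described above: it minimizes the number of falsified clauses, and its lower bound at a node is simply the count of clauses already falsified by the partial assignment, rather than the value of an LP relaxation. Clauses are lists of signed integers, and all names are hypothetical.

```python
def branch_and_bound(clauses, n):
    """Minimise the number of falsified clauses over all assignments to n variables."""
    best = {"cost": len(clauses) + 1, "assign": None}

    def falsified(assign):
        # A clause is certainly falsified once all its literals are assigned and false.
        return sum(
            all(abs(l) in assign and assign[abs(l)] != (l > 0) for l in c)
            for c in clauses
        )

    def visit(assign, var):
        bound = falsified(assign)      # lower bound for every completion of this node
        if bound >= best["cost"]:
            return                     # fathom: cannot beat the incumbent
        if var > n:                    # leaf: record a new incumbent (new upper bound)
            best["cost"], best["assign"] = bound, dict(assign)
            return
        for value in (False, True):    # branch on the next variable
            assign[var] = value
            visit(assign, var + 1)
            del assign[var]            # backtrack

    visit({}, 1)
    return best["cost"], best["assign"]
```

A returned cost of 0 means the formula is satisfiable and the returned assignment is a model. Replacing `falsified` with the optimum of an LP relaxation would give a tighter bound and hence earlier fathoming.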

9.4. Cutting-Plane Method. Unlike partitioning the feasible region into
subdivisions, as in branch-and-bound approaches, the cutting-plane algorithm solves
integer programs by modifying linear-programming solutions until an integer
solution is obtained. It works with a single linear program, which it refines by adding
new constraints. The new constraints successively reduce the feasible region until
an integer optimal solution is found.

Figure 26. An illustration of the cutting-plane method.

The idea of the cutting-plane method can be illustrated with a simple geometric
interpretation (Figure 26). The feasible region for the integer program, i.e., an
integer polytope, consists of those integer lattice points satisfying all constraints. A
cut is an inequality satisfied by all the feasible solutions of the integer program. A
cutting plane is a hyperplane defined by that inequality, and it conflicts with the
solution $X^*$ of the linear-programming relaxation. The cutting plane passes between
$X^*$ and the integer polytope and cuts off a part of the relaxed polytope containing
the optimal linear-programming solution $X^*$ without excluding any feasible integer
points. After the cut, the resulting linear program is solved again. If the optimal
values for the decision variables in the linear program are all integer, they are
optimal; otherwise, a new cut is derived from the new optimal linear-programming
tableau and appended to the constraints.

In  practice,  the  branch-and-bound  procedures  almost  always  outperform  the 
cutting-plane  algorithm.  Nevertheless,  the  algorithm  has  been  important  to  the 
evolution  of  integer  programming.  Historically,  it  was  the  first  algorithm  developed 
for  integer  programming  that  could  be  proven  to  converge  in  a  finite  number  of 
steps. In addition, even though the algorithm generally is considered very inefficient,
it has provided insights into integer programming that have led to other,
more  efficient  algorithms. 

9.5. Interior Point Method. The most important advance in linear programming
solution techniques was recently introduced by Karmarkar [303]. As
shown in Figure 27, compared to the Simplex method, which jumps from one corner
point to another corner point of the LP polytope until the optimal solution
is found, Karmarkar’s algorithm constructs an ellipsoid inside the polytope and
uses nonlinear transformations to project better solution guesses in the interior of
the polytope. Unlike the Simplex method, which approaches the optimal solution
by step-by-step searching and has an exponential worst-case complexity,
Karmarkar’s algorithm has been proven to be a polynomial time algorithm.

Figure 27. The basic ideas of the Simplex method and Karmarkar’s
method.

To apply Karmarkar’s algorithm to integer programming, first the 0/1 integer
program is transformed to a ±1 integer program. Then the potential function
is used, and the optimal integer solution to the original IP problem
is at the point where the potential function achieves a maximum. However, using
Karmarkar’s algorithm on integer programming may get stuck at a local minimum,
i.e., it is not guaranteed to find the optimal solution by projection. Therefore, it
is an incomplete algorithm.

9.6. Improved Interior Point Method. It is expected that a sequence of
interior points

(9.5)  $w^{k+1} = w^k + \alpha \Delta w^*$

is generated such that the potential function in Karmarkar’s algorithm is minimized.
It is crucial to determine the descent direction $\Delta w^*$ of the potential function around
$w^k$ and the step size $\alpha$.

In the original Karmarkar’s algorithm, the step size $\alpha$ is assumed to lie within (0, 1].
They used $\alpha = 0.5$ in their experiments to solve SAT inputs. If the potential
function is well represented by the quadratic approximation around the given point,
then if we move along the Newton direction and have the appropriate values for
certain parameters, we will reach the minimum; otherwise, recall that the step size
is chosen so that it reaches a minimum of the objective function on that line of the
given descent direction. So there is no reason to restrict $\alpha$ within (0, 1].

This suggests the necessity of using line search to choose the optimal step size.
Following this idea, Shi, Vannelli, and Vlach have recently given an improved interior
point algorithm [476]. In their algorithm, the step size $\alpha$ is determined by a
golden-section search [356]. Experiments show significant improvements on Karmarkar’s
algorithm.
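
For reference, a golden-section line search can be sketched as follows. This is a generic textbook version, not the implementation of [476]: it assumes the function of the step size is unimodal on the search interval, and the function name, interval, and tolerance are illustrative choices.

```python
import math

def golden_section_min(f, lo, hi, tol=1e-8):
    """Approximate the minimiser of a unimodal function f on [lo, hi]."""
    inv_phi = (math.sqrt(5) - 1) / 2          # about 0.618
    a, b = lo, hi
    c = b - inv_phi * (b - a)                 # left interior point
    d = a + inv_phi * (b - a)                 # right interior point
    fc, fd = f(c), f(d)
    while b - a > tol:
        if fc < fd:
            # Minimum lies in [a, d]; the old c becomes the new right point.
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = f(c)
        else:
            # Minimum lies in [c, b]; the old d becomes the new left point.
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = f(d)
    return (a + b) / 2
```

In a line search one would pass `f = lambda alpha: phi(w + alpha * dw)` for a potential function `phi`; nothing in the procedure confines the resulting step size to (0, 1], so the search interval can extend beyond 1. Each iteration reuses one of the two previous function values, so only one new evaluation is needed per step.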


10.  Special  Subclasses  of  SAT 

Certain subclasses of SAT that are known to be solvable in polynomial time have
been identified and explored. There are at least three reasons for discussing such
subclasses in this section. First, a given formula can be preprocessed and examined
to determine whether it is a member of a polynomial-time solvable subclass of SAT.
If so, a special, fast algorithm can be brought to bear on the formula. Second, a
portion of a given formula may be a member of such a subclass and its solution may
make solving the given formula easier. Third, study of such subclasses reveals,
in part, the nature of “easy” SAT formulas. On the other hand, as reported in
Section 12, studies of random formulas indicate that these known classes contain
only a small fraction of the formulas that can be solved rapidly.

Below,  we  consider  some  of  the  more  notable  polynomial-time  subclasses.  When 
we  say  apply  unit  resolution  we  mean  apply  the  unit  clause  rule  to  exhaustion. 
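
Since unit resolution is used throughout this section, a compact sketch may help fix the idea. It assumes clauses are lists of signed integers (k for variable k, -k for its negation); the function name and return convention are hypothetical. This naive version rescans the clause list after every assignment, so it is quadratic rather than linear-time.

```python
def unit_resolution(clauses):
    """Apply the unit clause rule to exhaustion.

    Returns (remaining_clauses, assignment), or (None, assignment)
    if some clause becomes empty (falsified).
    """
    clauses = [list(c) for c in clauses]
    assignment = {}
    while True:
        if any(len(c) == 0 for c in clauses):
            return None, assignment
        unit = next((c[0] for c in clauses if len(c) == 1), None)
        if unit is None:
            return clauses, assignment
        assignment[abs(unit)] = unit > 0
        # Drop clauses satisfied by the unit literal; delete its negation elsewhere.
        clauses = [[l for l in c if l != -unit] for c in clauses if unit not in c]
```

For example, on the clauses (x1), (¬x1 ∨ x2), (¬x2 ∨ x3 ∨ x4) the rule fixes x1 and then x2 to true and leaves the single clause (x3 ∨ x4).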

10.1. 2-SAT. A CNF formula containing clauses of one or two literals only is
solved in linear time by applying unit resolution [18, 155].

10.2.  Horn  and  Extended  Horn  Formulas.  A  CNF  formula  is  Horn  if 
every  clause  in  it  has  at  most  one  positive  literal.  This  class  is  widely  studied,  in 
part  because  of  its  close  association  with  Logic  Programming.  Horn  formulas  can 
be  solved  in  linear  time  using  unit  resolution  [144,  278,  466]. 
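
A minimal sketch of Horn recognition and solving, assuming clauses are lists of signed integers. The solver uses forward chaining, which has the same effect as repeatedly applying the unit clause rule to positive unit clauses; it is a simple quadratic-time illustration, not one of the linear-time algorithms of [144, 278, 466], and the function names are hypothetical.

```python
def is_horn(clauses):
    """A formula is Horn if every clause has at most one positive literal."""
    return all(sum(l > 0 for l in c) <= 1 for c in clauses)

def solve_horn(clauses, n):
    """Return a satisfying assignment of a Horn formula as a dict, or None."""
    true_vars = set()
    changed = True
    while changed:
        changed = False
        for c in clauses:
            # The clause fires when every negated variable is already forced true.
            body_true = all(-l in true_vars for l in c if l < 0)
            heads = [l for l in c if l > 0]
            if body_true and not any(h in true_vars for h in heads):
                if not heads:
                    return None            # all literals false: unsatisfiable
                true_vars.add(heads[0])    # the single positive literal is forced
                changed = True
    # Variables never forced true can safely be set to false.
    return {v: v in true_vars for v in range(1, n + 1)}
```

Setting the remaining variables to false works because every clause that never fired contains at least one negated variable that is false in the final assignment.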

The class of extended Horn formulas was introduced by Chandru and Hooker [79],
who were looking for conditions under which a Linear Programming relaxation
could be used to find solutions to propositional formulas. A theorem of
Chandrasekaran [84] characterizes sets of linear inequalities for which 0-1 solutions can
always be found (if one exists) by rounding a real solution obtained using an LP
relaxation. Extended Horn formulas can be expressed as linear inequalities that
belong to this family of 0-1 problems. The following graph-theoretic characterization,
taken from [501], is simpler than the LP characterization.

Let C be a clause constructed from a variable set V, and let R be a rooted
directed tree with root s (i.e., a directed tree with all edges directed away from
s) and with edges uniquely labeled with variables in V. Then C is extended Horn
w.r.t. R if the positive literals of C label a (possibly empty) dipath P of R, and the
set of negative literals in C label an edge-disjoint union of dipaths $Q_1, Q_2, \ldots, Q_t$ of
R with exactly one of the following conditions satisfied:

1. $Q_1, Q_2, \ldots, Q_t$ start at the root s.

2. $Q_1, \ldots, Q_{t-1}$ (say) start at the root s, and $Q_t$ and P start at a vertex
$q \ne s$ (if P is empty, $Q_t$ can start from any vertex).

A  clause  is  simple  extended  Horn  w.r.t.  R  if  it  is  extended  Horn  w.r.t.  R  and  only 
Condition  1  above  is  satisfied.  A  formula  is  (simple)  extended  Horn  w.r.t.  R 
if each of its clauses is (simple) extended Horn w.r.t. R. A formula is (simple)
extended Horn if it is (simple) extended Horn w.r.t. some such rooted directed tree
R.

One tree R for a given Horn formula is a star (one root and all leaves with an
edge for each variable in the formula). Hence, the class of extended Horn formulas
is a generalization of the class of Horn formulas.

Chandru and Hooker [79] showed that unit resolution alone can determine
whether or not a given extended Horn formula is satisfiable. A satisfying truth
assignment for a satisfiable formula may be found by applying unit resolution, setting
values of unassigned variables to 1/2 when no unit clauses remain, and rounding the
result by a matrix multiplication [79]. This algorithm cannot, however, be reliably
applied unless it is known that a given formula is extended Horn. Unfortunately,
the problem of recognizing extended Horn formulas is not known to be solvable in
polynomial time.

10.3. Formulas from Balanced (0, ±1) Matrices. The class of formulas
from balanced (0, ±1) matrices, which we call balanced formulas here, has been
studied by several researchers (see [100] for a detailed account of balanced matrices
and a description of balanced formulas). The motivation for this class is the
question: for SAT, when do Linear Programming relaxations have integer solutions?

Express a CNF formula of m clauses and n variables as an m × n (0, ±1)-matrix
M where the rows are indexed on the clauses, the columns are indexed on
the variables, and a cell M(i, j) has the value +1 if clause i has variable j as an
unnegated literal, the value -1 if clause i has variable j as a negated literal, and
the value 0 if clause i does not have variable j as a negated or unnegated literal.
A CNF formula is a balanced formula if in every submatrix of M with exactly two
nonzero entries per row and per column, the sum of the entries is a multiple of
four [507].

Let a CNF formula be cast, in standard fashion, as a linear programming
problem of the form $\{x : Mx \ge 1 - n(M),\ 0 \le x \le 1\}$, where $n(M)$ is a column
vector whose components are the number of negated literals in clauses at the rows
corresponding to those components. If M is balanced, then for every submatrix A
of M, the solution to $\{x : Ax \ge 1 - n(A),\ 0 \le x \le 1\}$ is integral [100]. From this
it follows that balanced formulas may be solved in polynomial time using linear
programming.
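
The matrix M and the vector n(M) are easy to build from a clause list. The sketch below assumes clauses are lists of signed integers and uses hypothetical function names; it constructs the two objects and checks whether a given point x lies in the relaxed region. It does not test balancedness and does not solve the linear program.

```python
def formula_matrix(clauses, n):
    """Return (M, neg): the (0, ±1) clause-variable matrix and the
    number of negated literals in each clause."""
    M = [[0] * n for _ in clauses]
    for i, clause in enumerate(clauses):
        for lit in clause:
            M[i][abs(lit) - 1] = 1 if lit > 0 else -1
    neg = [sum(l < 0 for l in clause) for clause in clauses]
    return M, neg

def lp_feasible(M, neg, x):
    """Check Mx >= 1 - n(M) row by row, together with 0 <= x <= 1."""
    if any(xj < 0 or xj > 1 for xj in x):
        return False
    return all(
        sum(a * xj for a, xj in zip(row, x)) >= 1 - k
        for row, k in zip(M, neg)
    )
```

For a 0-1 vector x, row i of the inequality holds exactly when clause i is satisfied, since each negated literal contributes 1 - x_j to the count of true literals.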

Balanced  formulas  have  the  property  that,  if  every  clause  contains  more  than 
one  literal,  then  for  every  variable  v  there  are  two  satisfying  truth  assignments:  one 
with v set to true and one with v set to false. Thus, the following is a simple
linear-time algorithm for finding solutions to known balanced formulas [100]. Apply unit
resolution  to  the  given  formula.  If  a  clause  is  falsified,  the  formula  is  unsatisfiable. 
Otherwise,  repeat  the  following  as  long  as  possible:  choose  a  variable  and  set  its 
value  to  true,  then  apply  unit  resolution.  If  a  clause  becomes  falsified,  then  the 
formula  is  unsatisfiable,  otherwise  all  clauses  have  been  satisfied  by  the  assignment 
resulting  from  the  variable  choices  and  unit  resolution. 

Unlike extended Horn formulas, balanced formulas are known to be recognizable
in polynomial time [100].

10.4.  Single-Lookahead  Unit  Resolution.  This  class  was  developed  as  a 
generalization  of  other  classes  including  Horn,  extended  Horn,  simple  extended 
Horn,  and  balanced  formulas  [463].  It  is  peculiar  in  that  it  is  defined  based  on 
an  algorithm  rather  than  on  properties  of  formulas.  The  algorithm,  called  SLUR, 
selects  variables  sequentially  and  arbitrarily  and  considers  both  possible  values  for 
each selected variable. If, after a value is assigned to a variable, unit resolution
does not result in a clause that is falsified, the assignment is made permanent and
variable  selection  continues.  If  all  clauses  are  satisfied  after  a  value  is  assigned 
to  a  variable  (and  unit  resolution  is  applied),  the  algorithm  returns  a  satisfying 
assignment.  If  unit  resolution,  applied  to  the  given  formula  or  to  both  sub-formulas 
created  from  assigning  values  to  the  selected  variable  on  the  first  iteration,  results 
in  a  clause  that  is  falsified,  the  algorithm  reports  that  the  formula  is  unsatisfiable. 
If  unit  resolution  results  in  falsified  clauses  as  a  consequence  of  both  assignments  of 
values  to  a  selected  variable  on  any  iteration  except  the  first,  the  algorithm  reports 
that  it  has  given  up. 

A  formula  is  in  the  class  SLUR  if,  for  all  possible  sequences  of  selected  variables, 
SLUR does not give up on that formula. SLUR takes linear time with the modification,
due to Truemper [510], that unit resolution be applied simultaneously to
both  branches  of  a  selected  variable,  abandoning  one  branch  if  the  other  finishes 
first  without  falsifying  a  clause.  Note  that  due  to  the  definition  of  this  class,  the 
question  of  class  recognition  is  avoided. 
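
The SLUR procedure described above can be sketched as follows. This is an illustration under stated assumptions, not the linear-time version with Truemper's modification: clauses are lists of signed integers, the "arbitrary" variable choice is made with a seeded random generator, the two branches are tried one after the other, and the function names and the three status strings are hypothetical.

```python
import random

def propagate(clauses):
    """Unit resolution to exhaustion; returns (None, partial) on a falsified clause."""
    clauses, partial = [list(c) for c in clauses], {}
    while True:
        if any(not c for c in clauses):
            return None, partial
        unit = next((c[0] for c in clauses if len(c) == 1), None)
        if unit is None:
            return clauses, partial
        partial[abs(unit)] = unit > 0
        clauses = [[l for l in c if l != -unit] for c in clauses if unit not in c]

def slur(clauses, rng=None):
    """Return ("sat", model), ("unsat", None), or ("give up", None)."""
    rng = rng or random.Random(0)
    rest, model = propagate(clauses)
    if rest is None:
        return "unsat", None
    first = True
    while rest:
        var = abs(rng.choice(rng.choice(rest)))   # arbitrary variable still present
        survivors = []
        for lit in (var, -var):                   # consider both values
            sub, partial = propagate(rest + [[lit]])
            if sub is not None:
                survivors.append((sub, partial))
        if not survivors:
            # Both values fail: a refutation on the first iteration, otherwise give up.
            return ("unsat", None) if first else ("give up", None)
        rest, partial = survivors[0]              # commit to a surviving branch
        model.update(partial)
        first = False
    return "sat", model
```

Variables absent from the returned model are unconstrained once all clauses are satisfied. The procedure never backtracks over a committed assignment, which is what distinguishes it from a full DPL search.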

All  Horn,  extended  Horn,  and  balanced  formulas  are  in  the  class  SLUR.  Thus, 
an  important  outcome  of  the  results  on  SLUR  is  the  observation  that  no  special 
preprocessing  or  testing  is  needed  for  a  variety  of  special  subclasses  of  SAT  when 
using  a  reasonable  variant  of  the  DPL  algorithm. 

A limitation of all the classes above is that they do not represent many interesting
unsatisfiable formulas. There are several possible extensions to SLUR which
improve the situation. One is to add a 2-SAT solver to the unit resolution step.
This extension is at least able to handle all 2-SAT formulas, which is something
SLUR cannot do. This extension can be elegantly incorporated into SLUR due
to an observation of Truemper: “Whenever SLUR completes a sequence of unit
resolutions, and if at that time the remaining clauses are nothing but a subset of
the original clauses (which they would have to be if all clauses have at most two
literals), then effectively the SLUR algorithm can start all over. That is, if fixing
of a variable to both values leads to an empty clause, then the formula has been
proved to be unsatisfiable. Thus, one need not augment SLUR by the 2-SAT algorithm,
because the 2-SAT algorithm (at least one version of it) does exactly what
the modified SLUR does.” Another extension of SLUR is to allow a polynomial
number of backtracks, giving up if at least one branch of the DPL tree does not
terminate at a leaf where a clause is falsified. Thus, unsatisfiable formulas with
short DPL trees can be solved. However, such formulas are uncommon.


10.5. q-Horn Formulas. This class of propositional formulas was developed
by Boros, Crama, Hammer, Saks, and Sun in [44] and [43]. We choose to characterize
the class of q-Horn formulas as a special case of monotone decomposition of
matrices [508, 510]. As in the case of balanced (0,±1) matrices, express a CNF
formula of m clauses and n variables as an m × n (0,±1)-matrix M where the rows
are indexed on the clauses, the columns are indexed on the variables, and a cell
M(i, j) has the value +1 if clause i has variable j as an unnegated literal, the value
-1 if clause i has variable j as a negated literal, and the value 0 if clause i does not
have variable j as a negated or unnegated literal. In the monotone decomposition
of M, columns are scaled by -1 and the rows and columns are partitioned into
submatrices as follows:

$$\begin{pmatrix} A^1 & E \\ D & A^2 \end{pmatrix}$$

where the submatrix $A^1$ has at most one +1 entry per row, the submatrix D
contains only -1 or 0 entries, the submatrix $A^2$ has no restrictions other than the
three values of -1, +1, and 0 for each entry, and the submatrix E has only 0 entries.
If the monotone decomposition of M is such that $A^2$ has no more than two nonzero
entries per row, then the formula represented by M is q-Horn.

A recent result by Truemper [510] can be used to find a monotone decomposition
for a matrix associated with a q-Horn formula in linear time. Once a q-Horn
formula is in its decomposed form it can be solved in linear time as follows. Treat
submatrix $A^1$ as a Horn formula and solve it in linear time using a method such
as in [144, 278, 466] which returns a minimum, unique truth assignment for the
formula with respect to true. If the Horn formula is unsatisfiable then the q-Horn
formula is unsatisfiable. Otherwise, the returned assignment satisfies $A^1$ and some
or all rows of D. The set of true variables in every truth assignment satisfying
$A^1$ contains the set of variables true in the returned minimum, unique truth
assignment. Therefore, since elements of D are either 0 or -1, no truth assignment
satisfying $A^1$ can satisfy any rows of D that are not satisfied by the minimum,
unique truth assignment. Hence, the only way $A^1$ and D both can be satisfied is
if $A^2$, minus the rows collinear with those of D that are satisfied by the minimum,
unique truth assignment, can be satisfied. Since $A^2$ represents a 2-SAT formula,
any subset is also 2-SAT and can be solved in linear time. If the answer is
unsatisfiable then the q-Horn formula is unsatisfiable; if the answer is satisfiable then
such a satisfying assignment plus the minimum, unique truth assignment returned
earlier are a solution to the q-Horn formula.

The  developers  of  the  class  q-Horn  also  offer  a  linear-time  solution  to  formulas 
in  this  class.  The  main  result  of  [43]  is  that  a  q-Horn  formula  can  be  recognized  in 
linear time. See [42] for a linear-time algorithm for solving q-Horn formulas.

Formulas in the class q-Horn are thought to be close to what might be regarded
as the largest easily definable class of polynomially solvable propositional formulas
because of a result due to Boros, Crama, Hammer, and Saks [44]. Let $\{v_1, v_2, \ldots, v_n\}$
be a set of Boolean variables, and let $P_k$ and $N_k$, $P_k \cap N_k = \emptyset$, be subsets of
$\{1, 2, \ldots, n\}$ such that the kth clause in a CNF formula is given by
$\bigvee_{i \in P_k} v_i \vee \bigvee_{i \in N_k} \bar{v}_i$. Construct the following system of inequalities:

$$\sum_{i \in P_k} \alpha_i + \sum_{i \in N_k} (1 - \alpha_i) \le Z, \quad (k = 1, 2, \ldots, m), \text{ and}$$

$$0 \le \alpha_i \le 1, \quad (i = 1, 2, \ldots, n),$$

where $Z \in \mathbb{R}^+$. If all these constraints are satisfied with $Z \le 1$ then the formula
is q-Horn. On the other hand, the class of formulas such that the minimum Z
required to satisfy these constraints is at least $1 + 1/n^{\epsilon}$, for any fixed $\epsilon < 1$, is
NP-complete. More information on the subject of q-Horn formulas will appear
in [510].
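
To make the system of inequalities concrete, the sketch below evaluates the left-hand side of each clause inequality for a given candidate vector α and reports the smallest Z that this α admits. It does not solve the linear program. The names `clause_weight` and `z_for` and the integer encoding of literals are assumptions of the example.

```python
from fractions import Fraction


def clause_weight(clause, alpha):
    """Left-hand side of one inequality: alpha_i for each unnegated
    literal v_i, and 1 - alpha_i for each negated literal."""
    return sum(alpha[abs(l)] if l > 0 else 1 - alpha[abs(l)] for l in clause)


def z_for(clauses, alpha):
    """Smallest Z that satisfies every clause inequality for this alpha."""
    return max(clause_weight(c, alpha) for c in clauses)


ONE = {v: Fraction(1) for v in range(1, 5)}
HALF = {v: Fraction(1, 2) for v in range(1, 5)}

horn = [[-1, -2, 3], [-3, 4], [1]]        # at most one unnegated literal per clause
two_sat = [[1, 2], [-1, 2], [1, -2]]      # at most two literals per clause

z_horn = z_for(horn, ONE)       # alpha = 1 certifies Z <= 1 for a Horn formula
z_two_sat = z_for(two_sat, HALF)  # alpha = 1/2 certifies Z <= 1 for a 2-SAT formula
```

Setting every α to 1 works for any Horn formula, and setting every α to 1/2 works for any 2-SAT formula, which is why both classes fall inside q-Horn. A value above 1 for one particular α proves nothing; the minimum over all α is what matters.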

10.6. Renamable Formulas. Suppose clauses of a CNF formula $\mathcal{F}$ are constructed
from a set V of variables and let $V' \subset V$. Define $switch(\mathcal{F}, V')$ to be the
formula obtained as follows: for every $v \in V'$, reverse all unnegated occurrences of
v in $\mathcal{F}$ to negated occurrences and all negated occurrences of v to unnegated
occurrences. For a given formula $\mathcal{F}$, if there exists a $V' \subset V$ such that $switch(\mathcal{F}, V')$ is
Horn, extended Horn, etc., then the formula is said to be renamable Horn, extended
Horn, etc., respectively.

The  algorithms  given  above  work  even  if  a  given  formula  is  renamable  to  a 
formula  in  the  class  for  which  they  apply.  Additional  classic  references  to  Horn 
renamability  are  [337]  and  [19]. 
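
The switch operation and the notion of Horn renamability can be illustrated with a short sketch. The search below tries every subset of variables, so it takes exponential time and is meant only to make the definition concrete. The function names and the integer encoding of literals are assumptions of the example.

```python
from itertools import combinations


def switch(clauses, flipped):
    """Complement every occurrence of the variables in `flipped`."""
    return [frozenset(-l if abs(l) in flipped else l for l in c) for c in clauses]


def is_horn(clauses):
    """A formula is Horn if each clause has at most one unnegated literal."""
    return all(sum(1 for l in c if l > 0) <= 1 for c in clauses)


def horn_renaming(clauses):
    """Return a smallest set V' making switch(F, V') Horn, or None.

    Brute force over all subsets of variables, for illustration only.
    """
    variables = sorted({abs(l) for c in clauses for l in c})
    for size in range(len(variables) + 1):
        for subset in combinations(variables, size):
            if is_horn(switch(clauses, set(subset))):
                return set(subset)
    return None
```

For example, `horn_renaming([[1, 2], [-1, 3]])` returns `{2}`: complementing variable 2 leaves one unnegated literal in each clause.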

It is interesting to note that there exist formulas in the class SLUR that are
neither renamable extended Horn formulas nor balanced formulas [463].

10.7.  Formula  Hierarchies.  Some  sets  of  clauses  not  falling  into  one  of  the 
polynomially solvable classes defined above may be reduced to “equivalent” formulas
that are members of at least one of these classes. If such reductions are
efficient,  these  sets  can  be  solved  in  polynomial  time.  Such  reductions  can  take 
place  in  stages  where  each  stage  represents  a  class  of  polynomially  solved  formulas, 
and  lower  stages  represent  classes  of  perhaps  lower  time  complexity  than  classes 
represented  by  higher  stages.  The  lowest  stage  is  a  polynomially  solved  base  class, 
such  as  one  of  the  classes  above. 

An  example  of  such  a  hierarchy  is  found  in  [182].  The  base  class,  at  stage 
0,  is  Horn.  Consider  a  stage  1  formula  that  is  not  Horn.  By  definition  of  the 
hierarchy, there is a variable v which, if set to true, leaves a set of non-satisfied
clauses  and  non-falsified  literals  that  is  Horn.  If  this  Horn  formula  is  found  to  be 
satisfiable,  we  can  conclude  the  original  formula  is.  Otherwise,  setting  v  to  false 
leaves  a  set  of  clauses  that  is  a  stage  1  formula  (empty  formulas  are  considered 
to  belong  to  every  stage).  Thus,  the  above  process  can  be  repeated  (on  stage  1 
formulas)  to  exhaustion.  Since  it  takes  linear  time  to  solve  Horn  formulas  and  in 
the  worst-case  a  linear  number  of  Horn  systems  must  be  considered,  the  process  for 
solving  formulas  at  stage  1  has  quadratic  complexity.  The  above  concept  can  be 
expanded  to  higher  stages  to  form  a  hierarchy:  at  stage  i,  when  setting  v  to  true,  a 
sub-formula  is  at  stage  i  -  1,  and  when  setting  v  to  false,  a  sub-formula  is  at  stage 
i.  Thus,  solutions  to  stage  i  formulas  are  carried  out  recursively  leading  to  a  time 
complexity that is bounded by $m^i$. An alternative way to solve formulas at stage i
in  the  hierarchy  is  to  use  i-resolution  (resolution  is  not  applied  unless  at  least  one 
clause  has  at  most  i  literals)  [62]. 

The  only  remaining  question  is  to  determine  whether  a  given  formula  is  a  stage 
i formula. This can be done with a bottom-up approach described in [182].

For other information on such hierarchies see, for example, [113, 184].

10.8.  Pure  Implication  Formulas.  Pure  implication  formulas  are  defined 
recursively  as  follows: 

1.  A  variable  is  a  pure  implication  formula. 

2. If $\mathcal{F}_1$ and $\mathcal{F}_2$ are pure implication formulas then $(\mathcal{F}_1 \to \mathcal{F}_2)$ is a pure
implication formula.

Eliminating parentheses on right to left associativity, a pure implication formula
can be written $\mathcal{F}_1 \to \mathcal{F}_2 \to \cdots \to \mathcal{F}_p \to z$ where z is a variable. We call the z
variable of a formula the right-end variable.

The satisfiability problem is trivial for a pure implication formula (setting the
right-end variable to true satisfies it) but the problem of falsifiability is NP-complete
even if all variables except the right-end variable
occur at most twice in the formula. Furthermore, the complexity of determining
falsifiability seems to increase at least exponentially with the number of occurrences
of the right-end variable [248]; this yields a hierarchy of classes starting from
linear-time solvability and going through NP-completeness. This is possibly due to the
fact that the expressive power of pure implication formulas at the lower levels of
the hierarchy is extremely limited. Despite this lack of expressibility, it seems that
the lower levels of the hierarchy are incomparable with other special
polynomial-time-solvable classes such as 2-SAT and SLUR. To make this more concrete, define
a class of CNF formulas related to pure implication formulas and call it PICNF(k).

A formula in the class PICNF(k) consists only of the following kinds of clause
groups: 

1:  V  X;i'2  V  A  (3?;ri  ^  ^7r2)  A  V 

2:  ip^Tci  V  Xtx2  ^  ^TTa)  A  (xttj  V  A  (2:77^  V  2:773) 

3:  (2:771) 

where the number of type 2 groups is fixed at k and each variable occurs at most
twice in a PICNF(k) formula. The falsifiability question for a given pure implication
formula with right-end variable occurring at most k times is identical to the
satisfiability question for a formula in class PICNF(k). If all but one of the totally negated
clauses are removed from such a formula, a complete set of at most n partial truth
assignments, each of which can be extended to satisfying truth assignments, can
be constructed in linear time. Doing this for each totally negated clause results
in k such sets of partial truth assignments. Multiplying these sets to find consistent
assignments spanning all k sets can determine whether the given formula is
satisfiable. This can be accomplished in $O(n^k)$ time, matching the complexity of
falsifiability of pure implication formulas. A recent result [172] shows this can be
reduced to $O(k^k n^2)$ time. We remark that the problem of determining satisfiability
for formulas of the union of the classes PICNF(k), for all k, is NP-complete.

The class PICNF(k), k fixed, is incomparable to other polynomially solved
classes discussed above. For example, there are SLUR CNF formulas that are not
represented as PICNF(k) formulas and vice versa (particularly many unsatisfiable
PICNF(k) formulas are not SLUR CNF formulas). Also, although it is easy to
construct a PICNF(k) formula that is renamable Horn (and therefore q-Horn), even
the  PICNF(l)  set  (xi Vx2Vx3)A(a:i  Vx2)A(a:i  Vx3)A(xi Vx3Vx4)A(2:iVx3)A(xi V2:4) 
is  not  q-Horn. 

PICNF(k) is interesting because, for k fixed, it contains formulas that are not
in other polynomial-time solvable classes and the severe lack of expressibility of
PICNF(k) formulas may be exploited to assist complexity investigations of class
hierarchies. In particular, why should the hierarchies discussed above have $O(|F|^k)$
complexity when a complexity of $O(2^k |F|)$, say, is not inconsistent with any developed
theory? PICNF(k) may be useful in answering this question.

10.9. Non-linear Formulations. An optimization problem with 0-1 variables
can be reduced to a constrained nonlinear 0-1 program. Such programs are
expressed as follows:

(10.1)  $\max_{x \in \{0,1\}^n} F(x) = \sum_{k=1}^{p} c_k T_k$

subject to

$g_i(x) = \sum_{k=1}^{p_i} a_{ik} T_{ik} \le b_i, \quad i = 1, 2, \ldots, m$

where

(10.2)  $T_k = \prod_{j \in N_k} x_j, \quad N_k \subset N = \{1, 2, \ldots, n\}, \quad k = 1, 2, \ldots, p$

and

$T_{ik} = \prod_{j \in N_{ik}} x_j, \quad N_{ik} \subset N, \quad k = 1, 2, \ldots, p_i, \quad i = 1, 2, \ldots, m.$


Problems in propositional logic, originating, for example, in graph theory, can be
expressed this way by associating 0 to false, 1 to true, $1 - a$ to $\bar{a}$, and $a \cdot b$ to $a \wedge b$.
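
As a small illustration of this association, the sketch below evaluates a CNF formula using only subtraction from 1 and multiplication. Disjunction is not listed among the rules above; the example obtains it from them through De Morgan's law, which is an assumption of the sketch, as are the function names.

```python
from itertools import product


def literal_value(l, x):
    """x_j for the literal v_j, and 1 - x_j for its negation."""
    return x[abs(l)] if l > 0 else 1 - x[abs(l)]


def clause_value(clause, x):
    """A disjunction written as 1 minus the product of the negated literals."""
    falsified = 1
    for l in clause:
        falsified *= 1 - literal_value(l, x)
    return 1 - falsified


def cnf_value(clauses, x):
    """A conjunction of clauses written as a product; the result is 0 or 1."""
    value = 1
    for c in clauses:
        value *= clause_value(c, x)
    return value


def agrees_with_logic(clauses, n):
    """Check the arithmetic encoding against direct Boolean evaluation."""
    for bits in product((0, 1), repeat=n):
        x = dict(zip(range(1, n + 1), bits))
        truth = all(any((x[abs(l)] == 1) == (l > 0) for l in c) for c in clauses)
        if cnf_value(clauses, x) != int(truth):
            return False
    return True
```

Expanding the products gives a polynomial in the 0-1 variables whose terms are products of variables, which is the shape of the terms in (10.1) and (10.2).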

Several  methods  for  solving  the  above  formulation  have  been  proposed  [234, 
236].  In  some  restricted  cases  these  methods  have  polynomial  time  complexity. 
Thus,  the  rich  literature  on  this  subject  can  be  carried  over  to  the  domain  of 
propositional  satisfiability  to  provide  low  complexity  algorithms  for  SAT  under 
corresponding  conditions. 

A notable example involves functions for which the co-occurrence graph is a
partial k-tree [107]. The DNF formulation expressed by equations (4.7)-(4.8) is
in the form of equations (10.1) and (10.2). Let F(x) be such a DNF function.
The co-occurrence graph of F(x) has a vertex set corresponding to the variables
$x_1, x_2, \ldots, x_n$ with an edge between $x_i$ and $x_j$ ($i \ne j$) if these variables occur
simultaneously in at least one product term of F(x). A simple, undirected graph G
is a k-tree if there is an ordering $\{x_{\pi_1}, x_{\pi_2}, \ldots, x_{\pi_n}\}$ of its vertices such that, for
all $j = 1, 2, \ldots, n - k$, in the subgraph $G_j$ induced by vertices $\{x_{\pi_j}, x_{\pi_{j+1}}, \ldots, x_{\pi_n}\}$
the vertex $x_{\pi_j}$ has degree k and its neighbors induce a complete subgraph of $G_j$.
A partial k-tree is any graph obtained by deleting edges from a k-tree. If the
co-occurrence graph of F(x) is a partial k-tree, then F(x) can be solved in linear
time [107]. Since the maximization problem for DNF formulas is the same as the
minimization problem for CNF formulas (by using $1 - x$ for literal $x$ and $x$ for literal
$\bar{x}$), CNF formulas can be solved in linear time if their corresponding co-occurrence
graph is a partial k-tree.
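
The two definitions above translate directly into code. The sketch below builds the co-occurrence graph from a list of product terms and checks whether a given vertex ordering satisfies the k-tree condition. It does not search for an ordering and does not recognize partial k-trees. The function names and the signed-integer encoding of literals are assumptions of the example.

```python
from itertools import combinations


def cooccurrence_graph(terms):
    """Edges (i, j), i < j, for variables sharing at least one product term."""
    edges = set()
    for term in terms:
        variables = sorted({abs(l) for l in term})
        edges.update(combinations(variables, 2))
    return edges


def is_k_tree_ordering(order, edges, k):
    """Check the ordering condition from the definition of a k-tree."""
    adjacent = {v: set() for v in order}
    for i, j in edges:
        adjacent[i].add(j)
        adjacent[j].add(i)
    n = len(order)
    for position in range(n - k):
        v = order[position]
        remaining = set(order[position:])        # vertices of G_j
        neighbors = adjacent[v] & remaining
        if len(neighbors) != k:
            return False
        # The neighbors must induce a complete subgraph of G_j.
        if any(b not in adjacent[a] for a, b in combinations(neighbors, 2)):
            return False
    return True
```

For instance, the terms `[(1, 2), (2, 3), (3, 4)]` give a path, which is a 1-tree under the ordering `[1, 2, 3, 4]`.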

Another example is a linear time algorithm for determining whether a 2-SAT
formula has exactly one solution, that is, is uniquely solvable. The question of
determining unique solvability is a tough one in general and it is even hard to determine
whether linear time algorithms exist for special subclasses of SAT®. However, one
is  presented  for  2-SAT  in  [235]  using  the  framework  of  pseudo-boolean  functions 
(that  is,  of  the  form  (10.1)  and  (10.2)).  Finally,  we  mention  the  result  of  [106] 
where  a  polynomial  time  algorithm  for  producing  a  parametric  representation  of  all 
solutions  to  a  2-SAT  formula  is  presented. 


®An almost linear algorithm for unique Horn-SAT has been obtained by Berman et al. [29]
and improved into a linear time algorithm by a slight modification due to Pretolani [418] (Minoux
developed a quadratic time algorithm in [375]).

10.10. Nested and Extended Nested Satisfiability. The complexity of
nested satisfiability has been studied in [312]. That study was inspired by
Lichtenstein’s theorem of planar satisfiability [345]. Index all variables in a CNF formula.
A clause $C_1$ straddles a clause $C_2$ if the index of a literal of $C_2$ is strictly between
two indices of literals of $C_1$. Two clauses overlap if they straddle each other. A
formula is nested if no two clauses overlap. The problem of determining satisfiability
for nested formulas, with the clauses ordered so that clause $C_i$ does not straddle clause
$C_j$ when $i < j$, can be solved in linear time [312].
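
The definitions of straddling, overlapping, and nestedness can be checked directly, as in the sketch below. It tests the definition by comparing all pairs of clauses and is not the linear-time satisfiability algorithm of [312]. The function names and the use of signed integers for literals, with the absolute value as the variable index, are assumptions of the example.

```python
from itertools import combinations


def straddles(c1, c2):
    """True if some literal of c2 has an index strictly between two
    indices of literals of c1, i.e. between the smallest and largest."""
    indices = [abs(l) for l in c1]
    low, high = min(indices), max(indices)
    return any(low < abs(l) < high for l in c2)


def overlap(c1, c2):
    """Two clauses overlap if they straddle each other."""
    return straddles(c1, c2) and straddles(c2, c1)


def is_nested(clauses):
    """A formula is nested if no two clauses overlap."""
    return not any(overlap(a, b) for a, b in combinations(clauses, 2))
```

The clauses (x1 ∨ x4) and (x2 ∨ x3) do not overlap, because only the first straddles the second, whereas (x1 ∨ x3) and (x2 ∨ x4) straddle each other.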

An  extension  to  nested  satisfiability  has  been  proposed  in  [237].  We  prefer  to 
skip  the  details  and  just  mention  that  this  extension  can  be  recognized  and  solved 
in  linear  time.  For  details,  the  reader  is  referred  to  [237]. 

11.  Advanced  Techniques 

In  this  section,  we  describe  a  number  of  advanced  optimization  techniques  for 
satisfiability  testing.  They  have  been  used  in  practical  engineering  applications  and 
have  proven  to  be  effective  for  certain  classes  of  SAT. 

11.1. General Boolean Representations. In practice, many problems in
integrated circuit design, such as logic verification, test pattern generation,
asynchronous circuit design, logic optimization, sequential machine reduction, and
symbolic simulation, can be expressed as Boolean satisfiability problems with arbitrary
Boolean functions. Each representation has corresponding algorithms for
satisfiability testing. A Boolean representation affects the performance of Boolean
manipulation methods accordingly. Thus, efficient representation and manipulation
of Boolean functions is crucial to many practical applications. Many different
representations have been proposed for manipulating Boolean functions. However,
many Boolean functions derived from practical circuit design problems suffer from
an exponential size in their representations, making satisfiability testing infeasible.

Most SAT algorithms work on conjunctive normal form (CNF) formulas, i.e.,
input formulas must be expressed as a product of sums of literals. The CNF formula
is a canonical formula used in most analytical studies but is not an efficient
representation in practical application problems. Many real engineering design problems use
non-clausal representations rather than the CNF formula. Algorithms in this
category may be regarded as non-clausal inference algorithms for satisfiability testing
[218]. Compared to CNF formulas, a non-clausal, general Boolean representation
is much more compact and efficient, although the transformation of an arbitrary
non-clausal expression into CNF can be done in polynomial time by introducing
new variables. This will result in clause-form representation of substantially larger
sizes [192, 412]. While this is not critical in complexity theory, it will have serious
impact in solving practical application problems.

In practice, a SAT algorithm can be made much more efficient if it works
directly on problems represented in a compact number of general Boolean formulas
rather than a large collection of CNF clauses. For a non-clausal SAT algorithm, the
evaluation of arbitrarily large, complex Boolean functions is a key to its efficiency
[226].

The next two subsections describe a sequential and a parallel Boolean
representation and manipulation method.

11.2. Binary Decision Diagram (BDD). Ordered Binary Decision Diagrams
(OBDDs) [59, 60] are an efficient representation and manipulation method for
arbitrary Boolean functions. This representation is defined by imposing restrictions
on the Binary-Decision-Diagram (BDD) representation introduced by Lee [3p] and
Akers [9], such that the resulting form is canonical. The OBDD representation and
its manipulation method are an extremely powerful technique in various practical
applications. It is particularly useful with formulas where one needs to consider every
solution, such as cases where one must search for optimal solutions. Although
the OBDD representation of a function may have size exponential in the number
of variables, many useful functions have more compact representations in practice.

Figure 28. A simple BDD example for F = (a + b) · (a + c).

A  BDD  gives  a  graphical  representation  of  Boolean  functions.  It  is  a  directed 
acyclic  graph  with  two  types  of  leaf  nodes,  0  and  1.  Each  non-leaf  node  is  labeled 
with  a  Boolean  variable  v  and  has  two  out-going  edges  labeled  0  (the  left  edge)  and 
1  (the  right  edge).  A  BDD  can  be  utilized  to  determine  the  output  value  of  the 
function by examining the input values. Every path in a BDD is unique, i.e., no
path contains nodes with the same variables. This means that if we arbitrarily trace
out a path from the function node to the leaf node 1, then we have automatically
found a value assignment to function variables for which the function will be 1 regardless
of the values of the other variables.

Given a simple example Boolean function F = (a + b) · (a + c), the BDD of
function F can be constructed to determine its binary value, given the binary values
of variables a, b, and c. At the root node of the BDD, we begin at the value of variable
a. If a = 1, then F = 1 and we are finished. If a = 0, we look at b. If b = 0,
then F = 0 and again we are finished. Otherwise, we look at c; its value will be
the value of F. The complete BDD for function F is shown in Figure 28, where all
the paths from the root function node F to the leaf node 1 are highlighted. Each
highlighted path yields a satisfying assignment. For F, the satisfying assignments
are a = 1, b = c = - and a = 0, b = 1, c = 1, where - denotes a don’t care
assignment.
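
The walk through the example can be reproduced with a small sketch that builds a decision diagram by Shannon expansion along a fixed variable order and then lists the paths to the leaf 1. The tuple representation of nodes and the names `build_bdd` and `paths_to_one` are assumptions of the example. The construction enumerates all assignments and so illustrates the data structure only, not an efficient BDD package.

```python
def build_bdd(f, order, assignment=()):
    """Return 0, 1, or a node (variable, low_child, high_child).

    Children that coincide are merged, so redundant tests are dropped.
    Equal subgraphs compare equal because nodes are plain tuples.
    """
    depth = len(assignment)
    if depth == len(order):
        return int(f(*assignment))
    low = build_bdd(f, order, assignment + (0,))
    high = build_bdd(f, order, assignment + (1,))
    if low == high:
        return low
    return (order[depth], low, high)


def paths_to_one(node, partial=None):
    """Yield one partial assignment per path from `node` to the leaf 1.

    Variables missing from an assignment are don't-care values.
    """
    partial = partial or {}
    if node == 1:
        yield dict(partial)
    elif node != 0:
        variable, low, high = node
        yield from paths_to_one(low, {**partial, variable: 0})
        yield from paths_to_one(high, {**partial, variable: 1})


F = lambda a, b, c: (a or b) and (a or c)
bdd = build_bdd(F, ("a", "b", "c"))
solutions = list(paths_to_one(bdd))
# bdd       == ("a", ("b", 0, ("c", 0, 1)), 1)
# solutions == [{"a": 0, "b": 1, "c": 1}, {"a": 1}]
```

The two paths match the two satisfying assignments listed in the text: a = 0, b = 1, c = 1, and a = 1 with b and c left as don't-care values.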

It is well known that the BDD size for a given function depends on the variable
order chosen for the function (e.g., {a,b,c} in Figure 28). Since the early
introduction of BDDs, several extensions have been proposed to reduce BDD sizes in
practical applications. In an ordered BDD [59, 60], the input variables are ordered,
and every path from the root node to the leaf node visits the input variables in an
ascending order. In practice, a simple topological based ordering heuristic [360]
yields small size BDDs for practical Boolean instances. A reduced ordered BDD
is an ordered BDD where each node represents a unique logic function. Bryant
showed that the reduced ordered BDD of a Boolean function is well-defined and is
a canonical representation of the function; i.e., two functions are equivalent if and only if their
reduced ordered BDDs (for the same variable order) are isomorphic [59, 60].

The OBDD is efficient for searching for optimal solutions of arbitrarily complicated
Boolean expressions. In VLSI circuit design, many practical problems require
the enumeration of all possible assignments for a given Boolean formula. The best
assignment that yields the minimum cost (e.g., minimal circuit structure, minimum
chip area, and maximum circuit speed) is then selected from these possible
assignments. Since most algorithms for satisfiability testing are designed for finding one
truth assignment, they are impractical for selecting an optimal assignment. BDDs
are very useful in such situations, since a simple and incremental enumeration of
all possible paths from the root node to the leaf node 1 yields all the truth
assignments. Thus, once the BDD for a Boolean function has been constructed, it is
straightforward to enumerate all assignments or find an optimal solution.

The  BDD  method  can  effectively  handle  small  and  medium  size  formulas.  For 
larger  size  formulas,  a  partitioning  into  a  set  of  smaller  sub-formulas  before  applying 
the  BDD  algorithms  has  been  suggested.  This  approach  works  well  for  asynchronous 
computer  circuit  design  problems  [223,  435]. 


11.3. The Unison Algorithms. Based on the total differential of a Boolean function,
the Unison algorithm is capable of evaluating arbitrarily large, complex Boolean
functions [218, 490, 489]. The Unison algorithm is built with a network of multiple
universal Boolean elements (UBEs). The topology of the Unison network specifies
the structure of Boolean functions. By dynamically reconfiguring the UBE’s
functionality, Unison is adaptable to evaluate general Boolean functions representing
the SAT/CSP problems.

The total differential, dF, of a Boolean function F represents the difference in
the function value due to the difference in input values. For a Boolean function
F(x, y) of two variables, x and y, the total differential is calculated from differences
in input, dx and dy, as:

(11.1)  $dF = F_x\,dx \oplus F_y\,dy \oplus F_{xy}\,dx\,dy,$

where $\oplus$ is the Exclusive-OR operation [503]. Let F(x, y) be a Boolean function of
two dependent variables x and y; i.e., $x = G(x_1, x_2)$ and $y = H(y_1, y_2)$. Following
(11.1), the total differential dF is:

(11.2)  $dF(G(x), H(y)) = F_x\,dG(x_1, x_2) \oplus F_y\,dH(y_1, y_2) \oplus F_{xy}\,dG(x_1, x_2)\,dH(y_1, y_2).$

It can be observed from (11.2) that the value of dF depends on total differentials
dG and dH, rather than the function values $G(x_1, x_2)$ and $H(y_1, y_2)$. By recursively
applying (11.2) to dG, dH, and their dependent variables, the total differential dF
can be evaluated based on only total differentials of the independent variables (see
Figure 29).
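
Equation (11.1) can be checked exhaustively for two-variable functions. The text does not spell out the partial derivatives, so the sketch below assumes the usual Boolean differences: F_x is F(x, y) ⊕ F(x̄, y), F_y is F(x, y) ⊕ F(x, ȳ), and F_xy is the Exclusive-OR of F over all four combinations of x and y. The function names are also assumptions of the example.

```python
from itertools import product


def total_differential_holds(F):
    """Check dF = Fx dx XOR Fy dy XOR Fxy dx dy for every x, y, dx, dy,
    where dF is the change F(x, y) XOR F(x XOR dx, y XOR dy)."""
    for x, y, dx, dy in product((0, 1), repeat=4):
        fx = F(x, y) ^ F(x ^ 1, y)            # assumed Boolean difference in x
        fy = F(x, y) ^ F(x, y ^ 1)            # assumed Boolean difference in y
        fxy = F(x, y) ^ F(x ^ 1, y) ^ F(x, y ^ 1) ^ F(x ^ 1, y ^ 1)
        predicted = (fx & dx) ^ (fy & dy) ^ (fxy & dx & dy)
        observed = F(x, y) ^ F(x ^ dx, y ^ dy)
        if predicted != observed:
            return False
    return True


def all_two_variable_functions():
    """Yield the 16 Boolean functions of two variables as truth tables."""
    for table in product((0, 1), repeat=4):
        yield lambda x, y, t=table: t[2 * x + y]


identity_holds = all(total_differential_holds(F)
                     for F in all_two_variable_functions())
```

Under these assumed definitions the identity holds for all 16 functions, so the change in F is determined by the changes dx and dy together with the three derivative values.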

The  Unison  algorithm  works  in  two  phases:  initialization  and  evaluation.  The 
initialization  phase  computes  partial  derivatives  that  determine  the  function  to  be 
evaluated  in  the  evaluation  phase.  The  partial  derivatives  are  constant  during  the 
evaluation  phase.  The  evaluation  phase  reads  input  values  and  computes  the  final 
results. The calculation is performed in a bottom-up fashion, starting from the
independent variables.

Figure 29. The relation between total differentials.


If a computer word is used to evaluate one Boolean operation, the code to
calculate dF produces only one result. With one computer word, however, the
computer is able to perform many bitwise AND and bitwise Exclusive-OR operations
in one instruction. In the implementation of the Unison algorithm, we take advantage
of this machine feature to increase execution speed and to reduce memory space.
In one of our implementations on the NeXT and SUN workstations [489], the
Unison algorithm uses 32 bits of a computer word to pack 32 Boolean operations.
If the ith bit in each operand is initialized to represent the ith Boolean operation,
then the ith bit of dF will have the result of the ith Boolean operation. Each of
the 32 bitwise operations is independent of the others, so the Unison algorithm
simultaneously evaluates 32 Boolean operations in one machine instruction. The
parallel implementation of the Unison algorithm is straightforward and can be
done in any programming language that supports bitwise Boolean operations.
Data structures and implementation details of the Unison algorithm are
discussed in [489].
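
The packing idea can be sketched as follows. This is an illustration of bit-parallel evaluation of (11.1) under an assumed 32-bit word, not the data structures of [489]. The names `pack`, `total_differential_word`, and `total_differential_bit` are assumptions of the example.

```python
MASK32 = 0xFFFFFFFF  # assumed 32-bit word, as in the implementation described above


def pack(bits):
    """Pack up to 32 Boolean values (0 or 1) into one word; bit i holds bits[i]."""
    word = 0
    for i, b in enumerate(bits):
        word |= (b & 1) << i
    return word


def total_differential_word(fx, dx, fy, dy, fxy):
    """Equation (11.1) applied to 32 independent lanes with bitwise AND and XOR."""
    return ((fx & dx) ^ (fy & dy) ^ (fxy & dx & dy)) & MASK32


def total_differential_bit(fx, dx, fy, dy, fxy):
    """Scalar reference: the same formula on single bits."""
    return (fx & dx) ^ (fy & dy) ^ (fxy & dx & dy)


# Lane i takes its five input bits from the binary expansion of i, so the
# 32 lanes cover every combination of (fx, dx, fy, dy, fxy).
lanes = [tuple((i >> s) & 1 for s in range(5)) for i in range(32)]
words = [pack([lane[s] for lane in lanes]) for s in range(5)]
packed = total_differential_word(*words)
```

Bit i of `packed` equals the scalar result for lane i, so one pass of word-level operations replaces 32 separate single-bit evaluations.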

The  Unison  architecture  is  built  with  a  network  of  multiple  universal  Boolean 
elements  (UBEs).  The  connection  topology  of  the  Unison  network  specifies  the 
structure  of  the  Boolean  function  evaluated  by  Unison.  The  structure  of  Boolean 
functions  specifies  the  connectivity  between  Boolean  expressions  of  two  variables. 
Each  UBE  accomplishes  a  2-variable,  simple  Boolean  function  in  Unison.  The 
outputs  of  two  UBEs  can  be  used  as  inputs  to  another  UBE.  This  enables  the 
construction  of  a  network  of  UBEs  capable  of  evaluating  arbitrarily  large,  complex 
Boolean  functions.  By  dynamically  reconfiguring  the  UBE’s  functionality,  Unison 
is  adaptable  to  the  evaluation  of  different  Boolean  functions  representing  SAT/CSP 
problems.  In  Unison  architectures,  there  is  essentially  no  limit  on  the  number  of  bits 
one  would  like  to  implement.  One  can  put  as  many  UBE’s  on  a  chip  as  possible  as 
long as the hardware resource permits. The implementations of the Unison
architecture, e.g., its network structure, UBE structures, and two CMOS hardware
implementations, are described in detail in [490, 489].

Combined with parallel evaluation, partial evaluation, and incremental evaluation
techniques, Unison can be incorporated into a variety of search and optimization
algorithms for satisfiability testing. It is especially important in real-time
applications where hardware processing with different Boolean functions is required.
It provides an efficient approach for fast non-clausal processing of SAT inputs.

11.4.  Multispace  Search.  Many  search  and  optimization  methods  have  been 
developed  in  combinatorial  optimization,  operations  research,  artificial  intelligence, 
neural  networks,  genetic  algorithms,  and  evolution  programming.  An  optimization 
algorithm  seeks  a  value  assignment  to  variables  such  that  all  the  constraints  are 
satisfied and the performance objective is optimized. The algorithm operates by
changing the values of the variables in the value space. Because value changing does
not  affect  the  formula  structure  and  the  search  space,  it  is  difficult  for  a  value  search 
algorithm  to  handle  the  pathological  behavior  of  local  minima. 

Multispace search is a new optimization approach developed in recent years
[213, 226, 215]. The idea of multispace search was derived from the principles of
non-equilibrium thermodynamic evolution: structural changes are more fundamental
than quantitative changes, and evolution depends on the growth of new structure
in biological systems rather than on information transmission alone. A search
process resembles the evolution process, and structural operations are important for
improving the performance of traditional value search methods [213, 226, 215].

In multispace search, any active component related to the given input structure
can be manipulated and thus formulated as an independent search space. For the
variables, values, constraints, objective functions, and key parameters (those that
affect the input structure) of a given optimization problem, we define the variable
space, the value space (i.e., the traditional search space), the constraint space, the
objective function space, the parameter space, and other search spaces, respectively.
The totality of all the search spaces constitutes a multispace.

The basic idea of multispace search is simple. Instead of being restricted to the
value space, the search takes the multispace as its search space. In the multispace,
components other than values can be manipulated and optimized as well. During the
search, a multispace search algorithm not only alters values in the value space; as
shown in Figure 30, it also walks across the variable space and other active spaces,
dynamically changes the input structure in terms of variables, parameters, and
other components, and systematically constructs a sequence of structured, intermediate
instances. Each intermediate instance is solved by an optimization algorithm,
and the solution found is used as the initial solution to the next intermediate
instance. By interplaying value optimization with structured operations, multispace
search incrementally constructs the final solution to the search instance through
a sequence of structured intermediate instances. Only at the last moment of the
search does the reconstructed instance structure approach the original instance
structure, so that the final value assignment represents the solution of the given
search input.

A multispace search algorithm combines traditional optimization algorithms with
structural multispace operations. In each search step, multispace search performs
two fundamental operations: a traditional value search and the structural
reconfiguration of the intermediate instance during each individual search phase.
According to the active event in the scrambling schedule [213, 214], the search
process enters a specified search space and performs structural operations on the
intermediate instance structures, followed by a traditional value search that
optimizes the constructed intermediate instance. The resulting intermediate solution
is then used as the initial solution for the next phase of multispace search.

Figure 30. In the value space, a traditional search process
(dashed line) cannot pass a “wall” of high cost search states
(hatched region). It fails to reach the final solution state, F. A
multispace search process (solid lines) scrambles across different
search spaces. It could bypass this “wall” through the other search
spaces.
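The alternation of structural steps and value steps can be illustrated with a small sketch. It is only one possible reading of the description above, not the algorithm of [213, 214]: the "structural operation" is assumed to be a simple growth of the constraint space (a prefix of the clause list), the value search is a plain random walk, and the names `unsatisfied`, `value_search`, and `multispace_search` are hypothetical. Clauses are lists of non-zero integers (negative means negated) and are assumed to be non-empty.

```python
import random

def unsatisfied(clauses, assign):
    """Return the clauses that are false under assign (dict: variable -> bool)."""
    return [c for c in clauses
            if not any(assign[abs(lit)] == (lit > 0) for lit in c)]

def value_search(clauses, assign, rng, max_flips=10_000):
    """Traditional value-space search: flip a variable from a random false clause."""
    for _ in range(max_flips):
        false = unsatisfied(clauses, assign)
        if not false:
            return True
        lit = rng.choice(rng.choice(false))
        assign[abs(lit)] = not assign[abs(lit)]
    return not unsatisfied(clauses, assign)

def multispace_search(clauses, n_vars, phases=4, seed=0):
    """Solve a sequence of intermediate instances that grows into the original."""
    rng = random.Random(seed)
    assign = {v: rng.random() < 0.5 for v in range(1, n_vars + 1)}
    for phase in range(1, phases + 1):
        # Structural step (assumed schedule): enlarge the constraint space.
        k = len(clauses) * phase // phases
        intermediate = clauses[:k]
        # Value step: start from the solution of the previous intermediate instance.
        value_search(intermediate, assign, rng)
    # The last intermediate instance is the original one.
    return assign if not unsatisfied(clauses, assign) else None
```

In this sketch the final phase always uses the full clause list, so an assignment is returned only if it satisfies the original formula; a `None` result means the incomplete value search gave up, not that the formula is unsatisfiable.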

The major structural operations in multispace search [213, 214] include
multispace scrambling [214, 226], extradimension transition (e.g., air bridge, real
dimension, and extra dimension) [212, 216, 217], search space smoothing [221],
multiphase search [485, 488, 212, 433, 542, 208, 222], local to global passage
[208, 208], tabu search [199], and perturbations (e.g., jumping, tunneling,
climbing, and annealing) [209, 210, 212, 217, 306].

In the next two subsections we describe two preprocessing methods for
satisfiability testing in multispace search: partitioning input size and partitioning
variable domain.

11.5. Partitioning to Reduce Input Size. A large NP-hard problem instance is
difficult to solve because of the excessive computing time it requires. Partitioning
a large input into a set of smaller sub-instances may permit efficient solution of
the input. There are two partitioning methods, each consisting of a partitioning, a
conquer, and an integration procedure. For constructive partitioning (e.g., divide
and conquer), the partitioning, conquer, and integration procedures are well defined
and easy to implement. For destructive partitioning, it is difficult to design the
partitioning and integration procedures.

We give an industrial case study that requires a SAT solver. The SAT solver
uses an efficient input size partitioning as a preprocessing step. This problem arises
in asynchronous circuit design. Asynchronous circuits are indispensable in many
low-power and high-performance digital computer systems. Due to their important
applications in mobile, portable, and military communication systems, there has
been great interest in the automated design and synthesis of asynchronous circuits
[89, 328, 348, 525]. The design of asynchronous control and interface circuits,
however, has proven to be an extremely complex and error-prone task. The core
problem in asynchronous circuit synthesis can be formulated as an instance of
SAT to satisfy the complete state coding (CSC) constraints, i.e., the SAT-Circuit
problem [526].

In this practical application problem, an optimal solution with minimal circuit
layout area is sought. An incomplete SAT solver such as local search, unfortunately,
does not guarantee an optimal solution and therefore is not applicable. Previous
researchers used efficient resolution and branch-and-bound procedures to handle
the SAT-Circuit problem. For most asynchronous circuit design problems, unfortunately,
they were not able to find an optimal solution and, for difficult asynchronous
circuit design problems, they could not locate even one solution.

Gu and Puri have recently developed a partitioning technique for satisfiability
testing and applied it to asynchronous circuit design [214, 223, 435]. The
partitioning preprocessor first decomposes the large SAT formula that represents
the given asynchronous circuit design into a number of smaller, disjoint
SAT formulas. Each small SAT formula can be solved efficiently. Eventually,
the results of these sub-formulas are integrated and contribute to the solution
of the original formula. This preprocessor avoids the problem of solving very
large SAT formulas and, in practice, guarantees finding a best solution. This
partitioning preprocessing is destructive since, during the search, extra variables
are introduced to resolve the critical CSC constraints. Furthermore, they built a
complete, incremental SAT solver based on binary decision diagrams (BDD). Their
system is able to find an optimal solution to the asynchronous circuit design
problem efficiently.
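The partition, conquer, and integrate steps can be sketched for the simplest constructive case, in which sub-formulas are disjoint because they share no variables. This is an illustrative assumption, not the destructive CSC partitioning of [214, 223, 435]; the function names are hypothetical, the conquer step is brute force, and clauses are assumed to be non-empty lists of signed integers.

```python
from itertools import product

def partition(clauses):
    """Group clauses into sub-formulas that share no variables (union-find)."""
    parent = {}

    def find(v):
        parent.setdefault(v, v)
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    for clause in clauses:
        root = find(abs(clause[0]))
        for lit in clause[1:]:
            parent[find(abs(lit))] = root
    groups = {}
    for clause in clauses:
        groups.setdefault(find(abs(clause[0])), []).append(clause)
    return list(groups.values())

def solve_small(clauses):
    """Conquer step: exhaustive search over the variables of one sub-formula."""
    variables = sorted({abs(lit) for c in clauses for lit in c})
    for values in product([False, True], repeat=len(variables)):
        assign = dict(zip(variables, values))
        if all(any(assign[abs(lit)] == (lit > 0) for lit in c) for c in clauses):
            return assign
    return None

def solve_by_partitioning(clauses):
    """Integration step: the union of the partial assignments solves the whole."""
    solution = {}
    for sub in partition(clauses):
        part = solve_small(sub)
        if part is None:
            return None  # one unsatisfiable sub-formula makes the whole unsatisfiable
        solution.update(part)
    return solution
```

Because the sub-formulas share no variables, the exhaustive cost is the sum of the sub-formula costs rather than their product, which is the source of the savings in this simplified setting.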


11.6. Partitioning Variable Domains. A variable domain contains values
to be assigned to variables. The size of a variable domain, along with the number of
variables, determines the computational complexity of an optimization algorithm.
From  a  theoretical  point  of  view,  even  a  small  reduction  in  the  variable  domain 
would  result  in  significant  improvements  in  computing  efficiency.  It  is,  however, 
difficult  to  make  use  of  variable-domain  reduction  techniques  in  solving  optimization 
problems.  Recently,  Wang  and  Rushforth  have  studied  mobile  cellular  network 
structures  and  developed  a  novel  variable-domain  reduction  technique  for  channel 
assignment  in  these  networks  [542,  543]. 
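As a rough illustration of why even a small domain reduction matters, consider exhaustive enumeration as the baseline. This baseline and the numbers below are assumptions chosen for illustration; they are not figures from [542, 543].

```latex
% n variables, each with d candidate values: d^n assignments in total.
% Removing a single value from every domain shrinks the space by the factor
\[
  \frac{(d-1)^{n}}{d^{n}} = \Bigl(1 - \frac{1}{d}\Bigr)^{n} \le e^{-n/d},
  \qquad \text{e.g. } d = 10,\ n = 100: \quad 0.9^{100} \approx 2.7 \times 10^{-5}.
\]
```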

The rapid growth of mobile cellular communication services has created a direct
conflict with the limited frequency spectrum. Channel assignment is an important
technique for the efficient utilization of frequency resources in mobile cellular
communications. Among several channel assignment problems, the fixed channel
assignment (FCA) is essential to the design and operation of cellular radio networks.
An FCA algorithm assigns frequency channels to calls such that the frequency
separation constraints are satisfied and the total bandwidth required by the system
is minimized. By encoding the constraints into clauses, the problem becomes an
instance of SAT. For a given cellular communication system, there are numerous
ways to assign a channel to a call request. An optimal channel assignment decision
can significantly improve the cellular system capacity without requiring extra cost.
For a fixed mobile cellular system, the capacity of the cellular system is mainly
determined by the performance of the channel assignment algorithms.

Wang and Rushforth’s channel assignment algorithm was developed based on
the structure of cellular frequency reuse patterns. Using their variable domain
partitioning technique, they partition a mobile cellular network with a larger variable
domain into two networks: a minimum network with a fixed and small variable
domain (due to the known frequency reuse patterns) and a difference network with
an even smaller variable domain [542, 543]. Channels are assigned separately to
the minimum network and to the difference network, and the superposition of these
two assignments constitutes an assignment to the original network.

Because this variable domain partitioning approach decomposes an instance
of a channel assignment problem with a large number of assignments into two
separate channel assignment sub-instances with considerably smaller numbers of
assignments, it dramatically reduces the computational complexity and thus improves
the computing efficiency for solving given inputs, in addition to significantly
improving the solution results. This partitioning technique can be combined with
any existing channel assignment algorithm to solve the channel assignment problem.
During numerous channel assignment experiments, this algorithm outperformed all
available algorithms for solving the practical channel assignment problem
benchmarks. Experimental evidence suggests that this partitioning approach is both
efficient and effective.

11.7.  Parallel  SAT  Algorithms  and  Architectures.  Many  parallel  SAT/CSP 
algorithms  have  been  developed.  In  a  recent  survey  [218],  the  following  parallel 
algorithms  for  solving  SAT  were  discussed: 

1.  1987:  Parallel  DP  algorithms 

2.  1986:  Parallel  discrete  relaxation  chips 

3.  1987:  Parallel  backtracking  architecture 

4.  1987:  Parallel  local  search  algorithm 

5.  1989:  Parallel  interior  point  method 

6.  1990:  Parallel,  differential,  non-clausal  inference 

7.  1991:  Parallel  aP  relaxation 

8.  1991:  Parallel  global  optimization 

9. 1992: Neural network approach

10.  1993:  Multiprocessor  local  search 

Some of the ideas behind these techniques are described in this paper.

For the following two reasons, algorithms running on loosely-coupled,
multiprocessor parallel computers offer limited performance improvements for solving SAT.
First, in the worst case, a SAT algorithm may suffer from exponential growth in
computing time. In order to solve a SAT formula effectively, we will need a
computer that has a much larger speedup than what is available today. This computer
will require the integration of at least a few million processors in a tightly-coupled
manner. This is infeasible with current computer system integration technology.

Second, as processors get much faster, the communication overhead among
processors in a parallel machine becomes a bottleneck, which may often take 70% to
even 90% of the total computing time [291]. Ideally one would expect the speedup
on a parallel computer to increase linearly with an increasing number of processors.
Due to serious off-processor communication delays, after a certain saturation point,
adding processors does not increase speedup on a loosely-coupled parallel machine.
Processor communication delay also makes process creation, process synchronization,
and remote memory access very expensive in a loosely-coupled multiprocessor
system. For this reason, the speedup on a multiprocessor is normally less than
the number of processors used. With simple SAT algorithms, however, speedup is
sometimes greater than the number of processors [383]. Variable settings similar
to those that are already known not to lead to a solution are also unlikely to lead
to a solution, and the obvious methods of parallelizing simple SAT algorithms break
up the tendency to search similar settings at about the same time.

From our experience, tightly coupled parallel computing, which effectively
reduces off-processor communication delays, is a key to the parallel processing of
SAT formulas [218]. In order to use a tightly-coupled parallel architecture for SAT
computation, one must map a computing structure to the input structure and must
reduce the total number of sequential computing steps through a large number of
symmetrical interactions among simple processors [206, 218]. Several different
approaches, e.g., special-purpose parallel VLSI architectures [224, 225], bit-parallel
programming on sequential machines [207, 212, 490, 489], and tight programming
on parallel computer systems, are promising alternatives in this direction. These
approaches are capable of providing a tight mapping between a formula structure
and a computing structure, resulting in faster computation. The computational
power of these approaches is orders of magnitude greater than that of standard
sequential algorithms running on uniprocessor machines or parallel algorithms
running on loosely coupled multiprocessors.

Parallel processing does not change the worst-case complexity of a SAT
algorithm unless one has an exponential number of processors. Parallel processing,
however, does delay the effect of exponential growth of computing time, allowing
one to solve larger instances of SAT.
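A back-of-the-envelope calculation makes the last point concrete. It assumes, purely for illustration, a sequential running time of the form $T(n) = c\,2^{n}$ and an ideal speedup of $P$ on $P$ processors; neither assumption is taken from the cited work.

```latex
\[
  T_P(n) = \frac{c\,2^{n}}{P}
  \quad\Longrightarrow\quad
  T_P(n + \log_2 P) = \frac{c\,2^{\,n + \log_2 P}}{P} = c\,2^{n} = T(n).
\]
% With P = 10^6 processors, log_2 P is about 20: in the same time budget the
% parallel machine handles roughly 20 more variables, and the growth stays exponential.
```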

11.8. The Multi-SAT Algorithm. The problem structures of real-world practical
applications vary significantly, making it difficult to develop a single efficient SAT
algorithm that solves a wide range of practical application problems. Many
efficient algorithms have been developed for the SAT problem. Each of them can solve
a class of problem instances efficiently. Backtracking algorithms can handle some
small, hard problem instances, providing complete solutions. Local search can
handle fairly large satisfiable problem instances quickly. A BDD SAT solver is
able to solve practical problem instances with performance criteria. The
Lagrangian-based global search method can provide solutions to a wide range of SAT
problem instances. Furthermore, problem size partitioning and problem domain partitioning
techniques empower the existing SAT algorithms to solve much larger practical
problem instances. If we combine the niches of several efficient algorithms,
they may be able to handle a much wider range of SAT problem instances efficiently.

A second motivation for the Multi-SAT algorithm comes from the existing
challenges in the design and testing of SAT algorithms. A good local search algorithm,
for example, consists of several basic components. These components are sensitive
to algorithm parameter setting, algorithm running environment, input size, and
problem structure. When designing a local search algorithm, we select among
several min-conflicts heuristics, several random value assignment heuristics, several
random variable selection heuristics, more than a dozen partial random variable
selection heuristics, several multiphase search heuristics, and several multispace
search meta-heuristics. Combined with hundreds of problem instances/benchmarks,
most of the time in a SAT algorithm’s design, implementation, and testing is spent
on a large number of parameterized executions, i.e., running different
versions of the algorithm on different, parameterized problem instances. A
Multi-SAT algorithm can relieve the load of this task, facilitating quick design and
testing of the algorithm.

Two algorithm integration methods have been proposed to integrate different
algorithms into a coherent and effective structure. In the hybrid algorithm approach,
algorithms in different classes are integrated in a single algorithm. The
hybrid algorithm makes use of different algorithmic niches according to some
decision procedures. Early practices of this approach include combining local search
with backtracking [206, 212] and combining global optimization with backtracking
[210, 217]. The effectiveness of this type of algorithm can be limited by the
overheads of decision making and algorithmic context switching.

In the algorithm clustering approach (“Future Work” in [217]), algorithms in
different classes are implemented and optimized individually to achieve the best
performance. Each algorithm is executed on a computer, and a cluster of computers
is used to execute several selected algorithms from different classes. The individual
results of the algorithms’ executions are hardwired together, producing the final
result. The algorithm clustering approach does not suffer from any performance
degradation due to algorithm integration, and thus can be run efficiently on a
cluster of computers. Because computer hardware prices continue to decrease, a cluster
of computers can be built in a cost-effective way (e.g., a powerful PC can now be
purchased for around $1,000). The only requirement for clustering computation
is multi-tasking integration software.

In a recently proposed Multi-SAT algorithm [219], we select several efficient
SAT algorithms from different algorithm classes including, for example, DPI and
CSAT from backtracking algorithms, SAT1, SAT3, and GSAT from local search
algorithms, a BDD SAT solver from binary decision diagram algorithms, and DLM
from the Lagrangian-based global search methods. Combined with problem size
partitioning and problem domain partitioning techniques, they together support
effective satisfiability testing for problem instances with uncertain structures, using
“many stones” to shoot “one bird.” In addition, they permit an automated tracking of a
suitable algorithm structure, allow a detailed study of the entire problem spectrum,
and provide a cost-effective multi-tool kit for practical satisfiability testing.
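The clustering idea of running several algorithms from different classes on the same instance and taking the first definite answer can be sketched as follows. This is a minimal sketch under stated assumptions, not the Multi-SAT system of [219]: the two solvers are toy stand-ins (an exhaustive search for the backtracking class and a random walk for the local search class), threads on one machine stand in for a cluster of computers, and all function names are hypothetical. Clauses are assumed to be non-empty lists of signed integers.

```python
import random
from concurrent.futures import ThreadPoolExecutor, as_completed
from itertools import product

def satisfies(clauses, assign):
    """True if every clause has at least one true literal under assign."""
    return all(any(assign[abs(lit)] == (lit > 0) for lit in c) for c in clauses)

def exhaustive(clauses, n_vars):
    """Complete solver: tries every assignment, so it can also report UNSAT."""
    for values in product([False, True], repeat=n_vars):
        assign = dict(enumerate(values, start=1))
        if satisfies(clauses, assign):
            return "SAT", assign
    return "UNSAT", None

def random_walk(clauses, n_vars, max_flips=20_000, seed=0):
    """Incomplete solver: may report SAT, never UNSAT."""
    rng = random.Random(seed)
    assign = {v: rng.random() < 0.5 for v in range(1, n_vars + 1)}
    for _ in range(max_flips):
        false = [c for c in clauses
                 if not any(assign[abs(lit)] == (lit > 0) for lit in c)]
        if not false:
            return "SAT", assign
        lit = rng.choice(rng.choice(false))
        assign[abs(lit)] = not assign[abs(lit)]
    return "UNKNOWN", None

def multi_sat(clauses, n_vars, solvers=(exhaustive, random_walk)):
    """Run all solvers concurrently and return the first definite answer."""
    with ThreadPoolExecutor(max_workers=len(solvers)) as pool:
        futures = [pool.submit(solver, clauses, n_vars) for solver in solvers]
        for done in as_completed(futures):
            status, assign = done.result()
            if status != "UNKNOWN":
                return status, assign
    return "UNKNOWN", None
```

Note that leaving the `with` block waits for the slower solvers to finish; a real dispatcher would cancel or kill the remaining jobs once one algorithm has answered.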

A basic software system for the Multi-SAT, Cluster, is shown in Figure 31.
It has an algorithm tool kit collecting candidate algorithms from different algorithm
classes, a problem instance database from which the user selects problem instances,
a piece of distributed system software, the job dispatcher, for remote job execution
control and execution result collection, and a network of computers executing the
selected algorithms. The software system can be run on a PC platform or a UNIX platform
under an interactive graphical interface. Users or system software can generate
a number of jobs and then submit them to the queue management system.
The jobs are run on available machines and the results are returned to the controlling
machine. A number of efficient Cluster software systems have been proposed for
the algorithm.

Figure 31. A software system, Clustor, for Multi-SAT algorithm.


12.  Probabilistic  and  Average-Case  Analysis 

Probabilistic  and  average-case  analysis  can  give  useful  insight  into  the  question 
of  what  SAT  algorithms  might  be  effective  under  certain  circumstances.  Sometimes, 
one  or  more  structural  properties  shared  by  each  of  a  collection  of  formulas  may 
be  exploited  to  solve  such  formulas  efficiently;  or  structural  properties  might  force 
a  class  of  algorithms  to  require  super-polynomial  time.  Such  properties  may  be 
identified  and  then,  using  probabilistic  analysis,  one  may  hope  to  argue  that  these 
properties  are  so  common  for  a  particular  class  of  formulas  that  the  performance 
of  an  algorithm  or  class  of  algorithms  can  be  predicted  for  most  of  the  formulas  in 
the  class. 

The main drawbacks of this approach are: 1) some distribution of input formulas
must be assumed and chosen distributions may not represent reality very well; 2)
results are usually sensitive to the choice of distribution, unlike results obtained
using randomized algorithms; 3) the state of analytical tools is such that distributions
yielding to analysis are typically symmetric with independent components; 4) few
algorithms have yielded to analysis. Despite these drawbacks, probabilistic results
can be a useful supplement to worst-case results (which can be overly pessimistic,
especially for NP-complete problems) in understanding algorithmic behavior.

This section reviews some notable probabilistic and average-case results for
certain  SAT  algorithms.  The  results  we  present  are  based  mainly  on  two  distribu¬ 
tions,  the  l-SAT  distribution  and  average  ^-SAT  distribution  (see  Section  5),  partly 
because  these  have  been  the  most  widely  used.  Both  are  distributions  over  CNF 
formulas.  Since  the  character  of  results  is  different  for  both  distributions,  we  devote 
one  subsection  to  each. 

12.1. Average l-SAT Model. The parameters of this distribution are the
number of clauses m, the number of variables n from which clauses are constructed,
and the probability p(n,m) that an unnegated variable or negated variable appears
in a given clause (see Section 5.1 for more details). Since variables are placed in
clauses independently, it is possible that null clauses, clauses with complementary
literals (tautological clauses), or unit clauses exist in a random formula. This does
not mimic reality very well. However, the mathematics associated with average-time
analyses for average l-SAT models is usually tractable. It would be straightforward
but tedious to modify average l-SAT calculations to exclude clauses
of length 0 or 1, but such results are unknown to us. In addition, tautological
clauses exist with high probability only over part of the parameter space.
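A generator for this distribution is short enough to write out. The sketch below assumes the reading of the model given above, namely that each of the 2n literals enters each clause independently with probability p; the function names `random_formula` and `classify` are hypothetical.

```python
import random

def random_formula(n, m, p, seed=None):
    """Draw m clauses; each of the 2n literals is included independently with prob. p."""
    rng = random.Random(seed)
    literals = list(range(1, n + 1)) + [-v for v in range(1, n + 1)]
    return [[lit for lit in literals if rng.random() < p] for _ in range(m)]

def classify(clause):
    """Label the degenerate clause shapes that this model can produce."""
    if not clause:
        return "empty"
    if any(-lit in clause for lit in clause):
        return "tautological"
    if len(clause) == 1:
        return "unit"
    return "other"
```

Counting the labels returned by `classify` over many sampled formulas is one simple way to see, for given n, m, and p, how often null, tautological, and unit clauses actually occur.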

Results presented below are asymptotic (that is, they apply when $n, m \to \infty$).

Satisfiable and Unsatisfiable Formulas. The following results highlight
those regions of the parameter space where random formulas are unsatisfiable or
satisfiable with high probability. It is easy to see that the average number of
literals in a clause is $2pn$ and the average number of times a variable appears in a
random formula is $2pm$. If $pn > \ln(m)$, a random truth assignment satisfies a
random formula in probability [165], and if $pn = c\ln(m)$, $1 > c > 1/2$, and
$\lim_{n,m\to\infty} [\ldots] < \infty$, $1 > \epsilon > 0$, a random formula is
satisfiable in probability [167]. If $pn < \ln(m)/2$, a random formula contains an
empty clause, and therefore is unsatisfiable, in probability. Thus, the only region
of the parameter space where random formulas may be difficult, in a probabilistic
sense, is defined by $pn = c\ln(m)$, $1 > c > 1/2$,
$\lim_{n,m\to\infty} [\ldots] = \infty$, $1 > \epsilon > 0$.
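The threshold at pn = ln(m)/2 for empty clauses can be motivated by a short first-moment calculation. The following is a heuristic sketch, assuming that each of the 2n literals enters a clause independently with probability p and that p tends to 0; it is not the proof given in the cited references.

```latex
\begin{align*}
\Pr[\text{a given clause is empty}] &= (1-p)^{2n} \approx e^{-2pn},\\
E[\text{number of empty clauses}] &= m\,(1-p)^{2n} \approx e^{\ln(m) - 2pn}.
\end{align*}
% If pn <= c ln(m) with c < 1/2, the expectation is about m^{1-2c} or more,
% which tends to infinity; since clauses are independent, an empty clause then
% exists in probability. If pn >= c ln(m) with c > 1/2, the expectation tends to 0.
```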

Polynomial-Time  Solvable  Classes.  Many  of  the  special  polynomial-time 
solvable  classes  discussed  in  Section  10  can  be  identified  with  regions  in  the  param¬ 
eter  space  as  well.  Here  we  give  some  examples  taken  mainly  from  [171]. 

If  pn  <  :  e  >  0,  a  random  formula  is  a  Horn  formula  in  probability. 

That  is,  all  the  non-empty  clauses  are  Horn  clauses.  If  pn  <  yjn^~^  jm  :  1  >  e  >  0, 
$\lim_{n,m\to\infty} m/n < 1$, a random formula is extended Horn in probability. If $pn >$

:  e  >  0,  a  random  formula  is  not  extended  Horn  in  probability.  This 
implies that, when $2pn \to \infty$ (no empty clauses, in probability), the parameter subspace
where  random  formulas  are  usually  extended  Horn  is  sharply  defined.  Surprisingly, 
simple  extended  Horn  formulas  are  abundant  in  a  relatively  small  subspace  of  the 
parameter  space.  If  pn  <  l/y/rny^  :  e  >  0,  a  random  formula  is  a  simple  extended 
Horn  formula  in  probability  but  if  pn  >  l/Vrn^^  :  e  >  0,  a  random  formula  is  not 
simple  extended  Horn,  in  probability. 

Random  formulas  are  balanced,  in  probability,  only  if  pn  <  :  e  >  0. 

Thus, when $\lim_{n,m\to\infty} m/n < 1$, balanced formulas are generated in abundance
over a region of the parameter space that is no larger than the subspace over which
random formulas are extended Horn in probability. The same statement is believed
to hold when $\lim_{n,m\to\infty} m/n > 1$.

If SLUR is weakened so that it always chooses to expand the true path of a selected
variable, if possible, then the parameter subspace where random formulas can be solved
by SLUR in probability is at least as large as that given by the three regions 1) $p < 1$
and $pn > 3\ln(m)$: $\lim_{n,m\to\infty} m/n > 1$; 2) $p < 1$: $\lim_{n,m\to\infty} m/n < 1$;
3) $pn < \ln(m)/2$. This is because no clauses containing either all negated variables
(pure negative clause) or all unnegated variables (pure positive clause) are in a random
formula, in probability, in region 1) (see Exploitable Properties below); random formulas
are extended Horn, or have no pure positive or negative clauses, in probability, in region
2); and random formulas contain empty clauses, in probability, in region 3).

In  summary,  the  SLUR  class,  modified  as  above,  dominates  nearly  the  entire 
parameter  space;  balanced  and  extended  Horn  formulas  are  frequently  generated 
only  when  either  the  average  number  of  occurrences  of  a  variable  in  a  formula 
tends  to  0  or  random  formulas  tend  to  have  a  large  number  of  empty  clauses;  Horn 
formulas  and  simple  extended  Horn  formulas  are  commonly  generated  over  a  small 
portion  of  the  parameter  space. 

Exploitable Properties. If a random formula is in one of the special,
polynomial-time solvable subclasses of SAT discussed earlier, it can be dealt with efficiently.
The  same  is  true  if  a  random  formula  has  one  or  more  of  certain  other  exploitable 
properties.  Three  of  these  are  described  here  (taken  from  [171]). 

A clause is pure if it contains only negated variables or only unnegated variables.
Call a formula that has no pure clauses a non-P-formula. A satisfying truth
assignment for any non-P-formula can be obtained in linear time. If
$pn > (1 + \epsilon)\ln(m)$, $\epsilon > 0$, a random formula is a non-P-formula, in probability.
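The linear-time claim follows from a simple observation: if no clause is pure, every clause contains at least one unnegated variable, so setting every variable to true satisfies the formula (setting every variable to false works equally well). A minimal sketch, with hypothetical function names and clauses represented as lists of signed integers:

```python
def is_pure(clause):
    """A clause is pure if all its literals have the same sign (empty counts as pure)."""
    return all(lit > 0 for lit in clause) or all(lit < 0 for lit in clause)

def solve_non_p(clauses, n_vars):
    """Return a satisfying assignment if the formula is a non-P-formula, else None."""
    if any(is_pure(c) for c in clauses):
        return None  # the property does not apply; this says nothing about satisfiability
    # Every clause has a positive literal, so the all-true assignment satisfies it.
    return {v: True for v in range(1, n_vars + 1)}
```

The check and the construction each take time linear in the size of the formula.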

A clause is a tautology if, for some variable $v$, both $v$ and $\bar{v}$ are in the clause.
Such clauses may be removed from a formula without affecting the Boolean function
it expresses. If enough tautological clauses exist in a formula, it is relatively easy to
solve. If $p^2 n > (1 + \epsilon)\ln(m)$, $\epsilon > 0$, all clauses of a random formula are
tautological, in probability.

If all $m$ clauses of a formula contain more than $\log_2(m)$ literals then the formula
must be satisfiable. A random formula has this property, in probability, when
$pn > 1.55\log_2(m)$.
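The satisfiability claim for long clauses can be checked with a standard expectation argument. The sketch below assumes that the literals of each clause are on distinct variables; a tautological clause is true under every assignment, so such clauses only make the bound easier to meet.

```latex
% Clause C_i has k_i > log_2(m) literals on distinct variables.
\begin{align*}
\Pr[C_i \text{ is false under a uniform random assignment}] &= 2^{-k_i} < 2^{-\log_2 m} = \frac{1}{m},\\
E[\text{number of false clauses}] &< m \cdot \frac{1}{m} = 1.
\end{align*}
% An expectation below 1 means that some assignment makes no clause false,
% i.e., the formula is satisfiable.
```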

Average-Case Results. Although the above results show that random formulas
are efficiently solved, in probability, over nearly all of the parameter space
of the average l-SAT model, they do not imply that polynomial-average-time
algorithms exist over a significant portion of the parameter space. For example, if,
out of a set of formulas, all but one can be solved by algorithm $A$ in
$O(n)$ time but the remaining formula requires $2^n$ time using $A$, then the set is
solved by $A$ in polynomial time, in probability, but the average complexity of $A$
over the set can still be exponential in $n$ (assuming all formulas are equally likely).
Thus, $A$ would get “stuck” on the above set of formulas even though it almost always
finds a solution to a random formula in linear time. This consideration has motivated the
average-case analysis of algorithms under the average l-SAT model. The results say
that exploiting some of the above properties individually is not enough to ensure
polynomial-time average complexity but, by exploiting certain properties collectively,
nearly the entire parameter space is covered by a collection of algorithms
with polynomial-average-time complexity. Here we give two examples.

Determining unsatisfiability from the existence of an empty clause in a given
formula alone is not strong enough to give polynomial average time if pn < ln(m)/2
since the probability that an empty clause exists in a random formula does not tend
to 1 fast enough. However, the empty clause check can be combined with other
methods to achieve polynomial-average-time complexity. For example, preprocess
a given formula by making all unit resolutions and all resolutions involving
variables that occur in the formula exactly twice; use backtracking, with the empty
clause check, to find solutions to the processed formula. Polynomial average time
is achieved when either

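1. $pn < (\epsilon - 1 - \delta)\ln(m)/(2\epsilon)$: $m = n^{\epsilon}$, $\epsilon > 1$, and $\delta$ is such that $\epsilon - 1 - \delta > 0$; or

2. 2.64(1 - +  2l3pn)) <  : $m = \beta n$, $\beta$ a constant; or

3. $pn < (1 - \epsilon - \delta)\ln(m)/\epsilon$: $m = n^{\epsilon}$, $1 > \epsilon > 2/3$, and $\delta$ is such that $1 - \epsilon - \delta > 0$;
or

4. pn < (ln(m)/4)^/3n2/3'^: $m = n^{\epsilon}$, $2/3 > \epsilon > 0$ [166].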

This  subspace  includes  nearly  all  of  the  half  plane  pn  <  ln(m)/2;  that  is,  the  region 
for  which  empty  clauses  exist,  in  probability. 

We  remark  that  the  above  algorithm  finds  all  solutions  to  a  given  formula. 

A variant of the DPL algorithm, called probe-order backtracking, that works
well for the half plane pn > ln(m) exploits the preponderance of non-P-formulas
that results from generating formulas in that region [430]. Given a formula $F$, if
an empty clause exists in $F$, output “unsatisfiable.” If there is no clause in $F$
containing only unnegated variables, output “satisfiable.” Otherwise, select a clause
in $F$ containing only unnegated variables $\{v_1, v_2, \ldots, v_k\}$. For
$i = 1, \ldots, k$, set $v_i$ to true, set $v_1, v_2, \ldots, v_{i-1}$ to false, and
recursively apply probe-order backtracking. Output “satisfiable” if and only if at
least one of these invocations has output “satisfiable.” Probe-order backtracking
runs in polynomial average time when pn > ln(m).
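
The recursive procedure just described can be sketched as follows. This is an illustrative reading of the prose, not the implementation analyzed in [430]: the clause representation (frozensets of signed integers), the helper `assign`, and the choice of the first all-positive clause found are assumptions made here. When no all-positive clause remains and no clause is empty, every clause contains a negated variable, so setting all remaining variables to false satisfies the formula.

```python
def assign(clauses, var, value):
    """Simplify the clause list after setting variable `var` to `value`."""
    true_lit = var if value else -var
    out = []
    for c in clauses:
        if true_lit in c:
            continue                    # clause is satisfied, drop it
        out.append(c - {-true_lit})     # remove the falsified literal, if present
    return out


def probe_order(clauses):
    """Return True if the formula is satisfiable, False otherwise."""
    if any(len(c) == 0 for c in clauses):
        return False                    # an empty clause cannot be satisfied
    positive = next((c for c in clauses if all(l > 0 for l in c)), None)
    if positive is None:
        return True                     # all-false assignment satisfies every clause
    vs = sorted(positive)
    for i, v in enumerate(vs):
        sub = clauses
        for u in vs[:i]:                # v_1 .. v_{i-1} are set to false
            sub = assign(sub, u, False)
        sub = assign(sub, v, True)      # v_i is set to true
        if probe_order(sub):
            return True
    return False
```

On the clauses (x1 or x2), (not x1), (not x2 or x3) the sketch first tries x1 = true, hits an empty clause, then succeeds with x1 = false, x2 = true.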

Other  interesting  results  are  found  in  [168,  414,  169,  203,  280,  428,  427, 
422]. 

Average Number of Solutions. In the average l-SAT model, the average
number of solutions per formula is approximately

$\exp[n \ln 2 + m \ln(1 - e^{-pn})]$.

Thus, when m/n and pn are such that the exponent is negative, formulas have very
few solutions, but when the exponent is positive formulas have many solutions, on
average. When m/n is below ln(2), the average number of sub-formulas generated
by simple backtracking is about the same as the average number of solutions. When
m/n is above ln(2), small values of pn still lead to few sub-formulas and large values
lead to a huge number of sub-formulas, but there is an intermediate range of values
where the average number of solutions is near zero while the average number of
nodes is an exponential function of n [428].
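
To see on which side of the boundary a given parameter setting falls, the exponent of the approximation above can be evaluated directly. The sketch below only evaluates that expression; the function name is hypothetical.

```python
import math


def log_avg_solutions(n, m, p):
    """Exponent n ln 2 + m ln(1 - e^{-pn}) of the approximate average solution count."""
    return n * math.log(2) + m * math.log1p(-math.exp(-p * n))


# With pn = ln 2 each clause term is ln(1/2), so the exponent is (n - m) ln 2:
# positive (many solutions on average) for m < n, negative (few) for m > n.
p = math.log(2) / 100
print(log_avg_solutions(100, 10, p) > 0, log_avg_solutions(100, 1000, p) < 0)
```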

The  average-time  analysis  for  backtracking  is  done  for  a  version  of  the  algorithm 
that  finds  all  solutions.  When  one  wants  just  one  solution,  there  is  no  need  for  the 
algorithm  to  solve  the  second  sub-formula  in  those  cases  where  the  first  sub-formula 
has  a  solution.  So  far  no  analysis  has  shown  just  how  much  time  can  be  saved 
by  stopping  early.  Since  stopping  early  can  have  an  effect  only  on  formulas  that 
have  solutions,  the  analysis  in  [428]  shows  that  there  is  a  considerable  range  of  pn 
values  where  simple  backtracking  takes  exponential  average  time  whether  or  not 
the algorithm stops at the first solution.

Additional Commentary. An average l-SAT analysis of unit clause backtracking
[422] shows that the conditions under which it is fast or slow are similar
to the conditions under which simple backtracking is fast or slow. Again, the
analyzed version of the algorithm finds all solutions. However, for moderate values
of m/n there is a range of pn values where simple backtracking takes exponential
time but unit clause backtracking takes polynomial time. For small values of m/n
unit clause backtracking has no significant advantage because the number of
solutions controls the running time, and for large values of m/n it has little
advantage because interesting formulas occur with large pn values, and so unit
clauses are rare.

The average l-SAT analysis of probe order backtracking shows that, in addition
to being fast under conditions where simple backtracking is fast, it is fast under
various other conditions. It is fast when pn is below 1. When m/n is small, the
typical l-SAT formula does not use most of the variables. Thus, most formulas
with one solution have an exponential number of solutions (one for each setting of
the unused variables). Simple backtracking takes no advantage of variables that
do not appear in the formula, but clause order backtracking does. Clause order
backtracking is also fast when pn^ is large compared to ln m + ln n. When p is this
large, setting just a few variables (to a random setting) tends to satisfy all of the
clauses. Clause order backtracking notices this while simple backtracking does not.

No average l-SAT analysis has been done for shortest clause backtracking. (See
[373] for a partial analysis of the l-SAT case.) It clearly has all the advantages of
unit clause backtracking and it should be much faster when pn is large, but it is
hard to know just how much faster. The first four prize winning entries in the
1992 SAT competition all used shortest clause backtracking [70] with refinements
to decide which of the various variables from the shortest clause to select. (The
fifth prize winning entry used a form of hypergraph searching.)

The  pure  literal  rule  algorithm  is  one  of  the  first  to  have  its  average  time 
computed  [66,  201,  203,  427].  It  has  the  essence  of  the  pure  literal  rule  from 
the  DP  procedure  [118];  by  removing  most  of  the  good  features,  an  analyzable 
algorithm  is  obtained.  Although  one  would  never  use  this  algorithm  in  practice 
(other  simple  algorithms  are  much  better)  it  rapidly  solves  a  wide  class  of  formulas 
in  polynomial  average  time,  but  does  not  find  all  solutions.  It  played  an  important 
role  in  the  early  history  of  average-time  analysis  of  SAT  algorithms  because  its 
analysis  is  so  simple,  and  the  cases  where  it  is  fast  are  so  different  from  those  of 
simple  backtracking  (the  other  simple  to  analyze  SAT  algorithm). 

The  almost  pure  literal  algorithm  [423]  extends  this  idea  by  noting  that  when 
there  are  few  occurrences  of  a  literal,  then  assigning  a  value  that  makes  that  literal 
false  leads  to  a  sub-formula  that  is  almost  a  subset  of  the  sub-formula  obtained 
by  setting  the  literal  to  true.  Thus,  if  a  formula  contains  one  clause  that  has  the 
only  occurrence  of  a  literal,  any  solutions  to  the  false  sub-formula  that  are  not  also 
solutions to the true sub-formula have false values for all remaining literals in the
special clause.

Another case where resolution does not increase the input size is when a variable
has one positive and one negative occurrence. Franco used this idea plus the pure
literal rule to develop an algorithm that is fast for small m so long as p is not
too large [166]. Using this algorithm for small p and probe order backtracking for
large p leads to an algorithm that is fast for m <  times logarithm factors.
More clever algorithms based on the same ideas combined with better analyses will
probably lead to an algorithm that is fast when m is smaller than a constant times
n.


12.2. The l-SAT Model. The parameters of this distribution are the number
of clauses m, the number of variables n from which clauses are constructed, and
the number of variables l in each clause. Clauses are constructed independently.
A clause is uniformly given by a set of l distinct variables that are negated
independently with probability 1/2. Thus, it is not possible that null clauses or
clauses with complementary literals exist in a random formula.
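
The distribution just defined can be sampled in a few lines. The following is a minimal sketch, assuming signed integers for literals; the function name and the optional `rng` parameter are illustrative choices, not part of the survey.

```python
import random


def random_l_sat(m, n, l, rng=random):
    """Draw m independent clauses, each over l distinct variables out of n,
    negating every chosen variable independently with probability 1/2."""
    formula = []
    for _ in range(m):
        variables = rng.sample(range(1, n + 1), l)   # l distinct variables
        formula.append(frozenset(v if rng.random() < 0.5 else -v
                                 for v in variables))
    return formula
```

Because the l variables of a clause are distinct, every clause has exactly l literals and can never contain a literal together with its complement, matching the remark above.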

The probabilistic analysis of SAT algorithms using the l-SAT model often seems
to be more difficult than using the average l-SAT model. Some of this difference is
associated with the structure of sub-formulas generated as a result of assigning a
value to a variable on an iteration of a particular algorithm. If such sub-formulas
are distributed according to the same model as the original formula, the analysis
can proceed easily. In the case of the l-SAT model, statistical dependence between
clauses after an iteration often prevents this. A notable exception, however, is in
the analysis of variants of the unit clause rule.

Another reason for the relative success of analysis under the average l-SAT
model is that many algorithms that are unworkable under the l-SAT model are
effective under the average l-SAT model. A notable example is the probe order
backtracking algorithm of the previous section. Under the average l-SAT model,
if pn > ln(m), purely negative or positive clauses are rare so probe order
backtracking works well. However, in the case of the l-SAT model, negative clauses
and positive clauses make up a fixed percentage of input clauses, so probe order
backtracking is ineffective in this case.

In what follows, when we refer to l-SAT, we assume $l \ge 3$ unless specifically
stated (as in, for example, 2-SAT).

Satisfiable and Unsatisfiable Formulas. It is easy to show that random
l-SAT formulas are unsatisfiable, in probability, if $m/n > -1/\log_2(1 - 2^{-l})$

2^  [168,  414].  It  has  also  been  shown  that  a  random  2-SAT  formula  is  satisfiable, 
in probability, if m/n < 1 [86, 200]. This implies that random l-SAT formulas are
satisfiable, in probability, if m/n < 1. The gap between 1 and
$-1/\log_2(1 - 2^{-l})$ has intrigued a number of researchers. The question is
whether there is some function $f(l)$ such that, for large n, m, if $m/n < f(l)$
then random l-SAT formulas are satisfiable, in probability, and if $m/n > f(l)$
then random l-SAT formulas are unsatisfiable, in probability. Several results have
shaved some of the gap from above and below but the question is still open for
$l > 2$. For the 2-SAT case, $f(2) = 1$ [86, 200]. For the 3-SAT case, from above,
it is known that random 3-SAT formulas are unsatisfiable, in probability, if
m/n > 4.758 [302]. This has been recently improved to 4.64 [150, 309]. From below,
for $l > 2$, it is known that random l-SAT formulas are satisfiable, in probability,
if $m/n < \max\{2^l/(4l), 1\}$ [91]. This result comes with an algorithm for SAT
(explained in Algorithms below) that finds a solution in polynomial time, almost
always. For 3-SAT, this has been improved to m/n < 3.003 [179] (algorithm
explained in Algorithms below).
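
The two closed-form bounds quoted above are easy to tabulate. The sketch below evaluates them for a given l; the function names are hypothetical. The upper bound is the usual first-moment bound: a fixed assignment satisfies a random clause with probability $1 - 2^{-l}$, so the expected number of satisfying assignments is $2^n(1 - 2^{-l})^m$, which tends to 0 exactly when m/n exceeds $-1/\log_2(1 - 2^{-l})$.

```python
import math


def unsat_ratio_bound(l):
    """Ratio m/n above which random l-SAT is unsatisfiable, in probability."""
    return -1.0 / math.log2(1.0 - 2.0 ** (-l))


def sat_ratio_bound(l):
    """Ratio m/n below which random l-SAT is satisfiable, in probability [91]."""
    return max(2.0 ** l / (4.0 * l), 1.0)


for l in (3, 4, 5):
    print(l, round(sat_ratio_bound(l), 3), round(unsat_ratio_bound(l), 3))
```

For l = 3 the general bounds leave the interval from 1 to about 5.19, which the 3-SAT results cited above narrow to 3.003 and 4.64.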

Polynomial-Time Solvable Classes. l-SAT formulas that are members of
certain polynomial-time solvable classes are not generated frequently enough, for
interesting ratios m/n, to assist in determining satisfiability. This is unlike the
situation for the average l-SAT model. We illustrate with a few examples.

The probability that a clause is Horn is $(l + 1)/2^l$. Therefore, the probability
that a random l-SAT formula is Horn is $((l + 1)/2^l)^m$, which tends to 0 for any
fixed l. A formula is hidden Horn if there is a set of variables (a switch set) whose
literals can all be reversed to yield a Horn formula. Regardless of switch set, there
are only $l + 1$ out of $2^l$ ways (negation patterns) that a random clause can become
Horn. Therefore, the expected number of successful switch sets is
$2^n((l + 1)/2^l)^m$, which tends to 0 if $m/n > 1/(l - \log_2(l + 1))$. Thus, random
l-SAT formulas are not hidden Horn, in probability, if
$m/n > 1/(l - \log_2(l + 1))$.
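
The hidden Horn bound can be checked numerically by taking base-2 logarithms of the expected number of switch sets, which gives $n - m(l - \log_2(l + 1))$. The following sketch evaluates the quantities named in the paragraph above; the function names are hypothetical.

```python
import math


def horn_clause_probability(l):
    """Probability that a random l-literal clause is Horn: (l + 1) / 2^l."""
    return (l + 1) / 2 ** l


def log2_expected_switch_sets(n, m, l):
    """log2 of 2^n ((l + 1)/2^l)^m, the expected number of successful switch sets."""
    return n + m * math.log2(horn_clause_probability(l))


def hidden_horn_ratio_bound(l):
    """Ratio m/n above which the expected number of switch sets tends to 0."""
    return 1.0 / (l - math.log2(l + 1))
```

For l = 3 the bound is exactly 1, so with n = 100 and m = 200 the logarithm of the expectation is 100 - 200 = -100.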

Associated with a q-Horn formula (see Section 10.5) is a partition $C_1$, $C_2$ of
clauses, and a partition $V_1$, $V_2$ of variables such that no clause in $C_1$ has a
variable in $V_2$ and, for each clause in $C_2$, there are at least one and at most
two variables taken from $V_2$. The probability that a particular pairwise partition
has this property can be computed. Multiplying by the number of pairwise partitions
gives the expected number of such partitions, which is an upper bound on the
probability that one exists. We find that no such partitions exist with
$|V_1| < (|V_1| + |V_2|)/2$, in probability, if
$m/n > 1/\log_2(2^{l+1}/(l^2 - l + 2)) = 1/(l - \log_2(l^2 - l + 2) + 1)$. Coupled
with the above hidden Horn result on l-SAT formulas, we have the remarkable result
that random l-SAT formulas are not q-Horn, in probability, if
$m/n > 2/(l - \log_2(l + 1))$. This bound can be reduced considerably; however, the
point we make is that, for large enough l, even the following simple algorithm is
more effective, in some probabilistic sense, on random l-SAT formulas than looking
for q-Horn formulas: randomly remove all but 2 literals from every clause; solve the
resulting 2-SAT formula; if it is satisfiable, return a satisfying truth assignment,
otherwise give up.
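
The simple algorithm at the end of the paragraph above can be sketched as follows. The survey does not say how the 2-SAT formula is solved; this sketch assumes the standard implication-graph method with strongly connected components (Kosaraju's two passes). Clauses are assumed to be tuples of signed integers over variables 1..n, and all names are hypothetical.

```python
import random


def solve_2sat(n, clauses):
    """Satisfy clauses of 1 or 2 literals over variables 1..n; return dict or None."""
    def node(lit):                       # literal -> vertex of the implication graph
        return 2 * (abs(lit) - 1) + (1 if lit < 0 else 0)
    size = 2 * n
    graph = [[] for _ in range(size)]
    rgraph = [[] for _ in range(size)]
    for c in clauses:
        a, b = c[0], c[-1]               # a unit clause (a) is read as (a or a)
        for x, y in ((-a, b), (-b, a)):  # (a or b): not a -> b and not b -> a
            graph[node(x)].append(node(y))
            rgraph[node(y)].append(node(x))
    # Pass 1: iterative DFS recording vertices in order of completion.
    order, seen = [], [False] * size
    for s in range(size):
        if seen[s]:
            continue
        seen[s] = True
        stack = [(s, 0)]
        while stack:
            u, i = stack.pop()
            if i < len(graph[u]):
                stack.append((u, i + 1))
                w = graph[u][i]
                if not seen[w]:
                    seen[w] = True
                    stack.append((w, 0))
            else:
                order.append(u)
    # Pass 2: label components on the reversed graph, in topological order.
    comp, label = [-1] * size, 0
    for s in reversed(order):
        if comp[s] != -1:
            continue
        comp[s] = label
        stack = [s]
        while stack:
            u = stack.pop()
            for w in rgraph[u]:
                if comp[w] == -1:
                    comp[w] = label
                    stack.append(w)
        label += 1
    result = {}
    for v in range(1, n + 1):
        if comp[node(v)] == comp[node(-v)]:
            return None                  # v and not v imply each other
        result[v] = comp[node(v)] > comp[node(-v)]
    return result


def reduce_and_solve(n, clauses, rng=random):
    """Keep 2 random literals per clause, solve the 2-SAT formula, or give up (None)."""
    reduced = [rng.sample(list(c), 2) if len(c) > 2 else list(c) for c in clauses]
    return solve_2sat(n, reduced)
```

Any assignment returned satisfies the original formula, since each reduced clause keeps a subset of the original literals; a result of None means only that the heuristic gave up, not that the formula is unsatisfiable.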


Algorithms. We mention the two best positive results to date and one negative
result. The first algorithm, called SC for Short Clause, iteratively selects a variable
and assigns it a value until either a solution is found or it gives up because it has
reached a dead end. Such an assignment may satisfy some clauses and falsify some
literals. There is no backtracking in SC. Variables are selected as follows: if there
is a clause with one non-falsified literal, choose the variable and value that satisfies
that clause; otherwise, if there is a clause with two non-falsified literals, choose one
of the variables and the value that satisfies that clause; otherwise, choose the
variable arbitrarily. This algorithm is a restricted version of GUC [83] (Generalized
Unit Clause) that always chooses a variable and value that satisfies a clause with the
fewest non-falsified literals. The analysis of SC is given in [91]. The
result is that SC does not give up, in probability, if $m/n < 2^l/(4l)$.
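
The selection rule of SC can be written out as a short loop. This is a sketch of the rule as described in the paragraph above, not the code analyzed in [91]; the tie-breaking choices (lowest-numbered literal in a short clause, lowest-numbered unassigned variable set to true in the arbitrary case) are assumptions, as are the names.

```python
def short_clause(clauses, n):
    """Return a satisfying assignment {var: bool}, or None when SC gives up."""
    clauses = [frozenset(c) for c in clauses]
    assignment = {}
    while clauses:
        shortest = min(clauses, key=len)
        if len(shortest) == 0:
            return None                    # dead end: some clause is falsified
        if len(shortest) <= 2:
            # a clause with one (preferred) or two non-falsified literals exists
            lit = min(shortest, key=abs)
        else:
            # no short clause: take any unassigned variable (assumed: lowest, true)
            lit = min(v for v in range(1, n + 1) if v not in assignment)
        assignment[abs(lit)] = lit > 0
        # drop satisfied clauses, remove the falsified literal from the others
        clauses = [c - {-lit} for c in clauses if lit not in c]
    for v in range(1, n + 1):
        assignment.setdefault(v, True)     # untouched variables are free
    return assignment
```

Since `min(clauses, key=len)` returns a unit clause whenever one exists, the priority order of the rule (one literal, then two, then arbitrary) is preserved. GUC differs only in always satisfying a literal from a shortest clause, whatever its length.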

By adding a limited amount of backtracking to GUC, Frieze and Suen get an
algorithm for 3-SAT, called GUCB, that finds a satisfying assignment, in
probability, when m/n < 3.003 [179]. Backtracking is managed as follows. Consider
the sequence of variable selections and assignments up to a given iteration h in
the execution of GUCB. Let this sequence be represented as a list of variable-value
pairs $\{(x_{\pi_1}, v_1), (x_{\pi_2}, v_2), \ldots, (x_{\pi_h}, v_h)\}$. Suppose, for
$p > 1$, $v_p$ = false, $v_{p+1} = \cdots = v_h$ = true, and two clauses contain one
non-falsified literal but no truth assignment will satisfy both. Then set
$v_{p+1} = v_{p+2} = \cdots = v_h$ = false, update all clauses accordingly (satisfied
clauses and falsified literals) and continue iteratively selecting variables and
assigning values.

Finally, we mention the important result in [86] that resolution proofs must be
exponentially large, in probability, for random unsatisfiable l-SAT formulas
generated with m/n fixed. Thus, for $m/n > -1/\log_2(1 - 2^{-l})$ (fixed), resolution
requires exponential time, in probability. This, of course, implies that DPL trees
are also exponential in size for $m/n > -1/\log_2(1 - 2^{-l})$.

Other Non-Backtracking Heuristics. The algorithms SC and GUC mentioned
above repeatedly choose a variable and a value until either a satisfying
assignment is found or a clause becomes falsified, in which case the algorithm gives
up. The heuristic used to select the variable and value is strongly associated with
how often the algorithm succeeds. A reasonable heuristic is to make the choice
that maximizes the number of assignments satisfying the formula that remains
after the selected value is assigned to the selected variable. Alternatively, the
selected variable and value might maximize the expected number of satisfying
assignments. This expectation can be approximated as follows. Suppose a formula has
$m_i$ clauses of i literals for all $i \ge 1$ and n distinct variables. If all clauses
are statistically independent and all clauses of i literals are equally likely, the
average number of satisfying assignments is
$2^n (1/2)^{m_1} (3/4)^{m_2} (7/8)^{m_3} \cdots$. Thus, we may choose a variable and
value that maximize this number. Equivalently, the choice may be made to maximize
the log of this number, or
$n + m_1 \log(1/2) + m_2 \log(3/4) + m_3 \log(7/8) + \cdots$, which, replacing each
$\log(1 - 2^{-i})$ by $-2^{-i}$, is roughly
$n - m_1(1/2) - m_2(1/4) - m_3(1/8) - \cdots$. Removing n, which is unimportant,
leaves Johnson’s heuristic described in [285]. Although this heuristic
has not been analyzed on the l-SAT model, experiments have shown it to be quite
effective when used in conjunction with unit resolution.
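
In its usual form, Johnson's heuristic gives every clause of i literals the weight $2^{-i}$ and satisfies the literal whose clauses carry the largest total weight, which removes as much as possible from the sum $m_1/2 + m_2/4 + m_3/8 + \cdots$. The sketch below implements that one-sided form only; it ignores the weight added when the complementary literal is falsified and clauses become shorter. The function name and the clause representation (sets of signed integers, at least one non-empty clause) are assumptions.

```python
from collections import defaultdict


def johnson_choice(clauses):
    """Return the literal maximizing the sum of 2^(-|C|) over clauses C containing it."""
    weight = defaultdict(float)
    for c in clauses:
        for lit in c:
            weight[lit] += 2.0 ** (-len(c))
    return max(weight, key=weight.get)


# Literal 1 occurs in two 2-literal clauses (weight 1/4 + 1/4), more than any other.
print(johnson_choice([{1, 2}, {1, -3}, {-1, 2, 3}]))
```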

13.  Performance  Evaluation 

The most important measure of a SAT algorithm’s performance remains its
practical problem-solving ability. For inputs requiring only one solution, both
complete algorithms and incomplete algorithms are applicable. For inputs requiring
all solutions or an optimal solution, only complete algorithms will work. The past
two decades have seen the proliferation of different algorithms for solving SAT:
resolution, local search, global optimization, BDD SAT solver, and multispace search,
among others. Previous experience indicates that these techniques complement
rather than exclude each other by being effective for particular instances of SAT.

In this section, we summarize the experimental performance of several typical
SAT algorithms on some random instances, DIMACS benchmarks, structured
instances, and practical industrial benchmarks. A fuller version of SAT algorithms’
benchmarking results will appear in a forthcoming paper, “Algorithms for the
Satisfiability (SAT) Problem: Benchmarking,” by the same authors.

13.1. Experiments on Random Formulas. In this section, we give
experimental results for the following SAT algorithms in solving random l-SAT
formulas and random average l-SAT formulas:

1. SAT1.3: a sequential CNF local search algorithm [207, 220, 211, 212].

2. SAT1.7: a parallel CNF local search algorithm [207, 212].

3. SAT1.13: a complete CNF local search algorithm [207, 212].

4. SAT1.4: a sequential DNF local search algorithm [207].

5. SAT1.8: a parallel DNF local search algorithm [207].

6. SATIJS: a complete DNF local search algorithm [207].

7. SAT14.6: an optimized, discrete global optimization algorithm [207, 211,
217].

8. SAT14.16: a complete global optimization algorithm [207, 211, 217].

9. SAT14.7: a continuous, global optimization algorithm [207, 211, 227].

10. SAT14.11: a complete, continuous global optimization algorithm [207, 211, 210].

11. DPL: a Davis-Putnam algorithm in Loveland form [117].

12. GSAT: a sequential, greedy local search algorithm [469].

13. IP: a parallel interior point zero-one integer programming algorithm [301,
299].


Real Execution Times. In Table 1 we give real execution times of some
local search and global optimization algorithms for solving l-SAT instances. All
the experiments were run on a SUN SPARC 2 workstation. The number of clauses (m),
the number of variables (n), and the number of literals per clause (l) are given in
the first three columns. Symbol “G/L” in Column 4 stands for the number of times
that all the algorithms hit global/local minimum points. From these results we can
observe that, in terms of global convergence and local convergence rate, these local
search and global optimization algorithms exhibit desirable convergence properties
and fast computing speed for instances in the table.

Among optimization algorithms, the parallel CNF local search (SAT1.7)
algorithm was much faster than the sequential local search (SAT1.3) algorithm. The
SAT1.7 algorithm had comparable computing performance with the DNF parallel
local search (SAT1.8) algorithm. The discrete global optimization (SAT14.6)
algorithm was slightly slower than the parallel local search algorithms. The
complete local search (SAT1.13) algorithm and the complete global optimization
(SAT14.16) algorithm, due to systematic bookkeeping, were slightly slower than
parallel local search but significantly faster than the sequential local search
algorithm.

As discussed in [220, 212], beyond a certain range of hardness, for example,
for m = 8500, n = 1000, and l = 4, the computing time of these optimization
algorithms started to increase.

The experimental results shown in Table 1 were collected from early reports in
[207, 212, 217]. The present local and global optimization algorithms are much
faster than their previous versions [211, 212, 222].

Performance Comparison with the DP Algorithm. The execution
results of the DPL algorithm and some optimization algorithms for solving l-SAT
instances are given in Table 2. We executed each algorithm ten times and report
the average execution times. Because DPL was slow for large size instances, we set
a maximum execution time of 120 x m/n seconds as the time limit of its execution.
Symbol “S/F” in Column 4 stands for DPL’s success/failure in giving an answer
within such a time limit. For DPL, the average execution time does not include the
maximum execution time limit if some of the ten executions were successful; the
average execution time was taken as the maximum execution time limit only if all
ten executions failed. Symbol “G/L” in Column 6 stands for the number of times
that all the remaining SAT optimization algorithms hit the global/local minimum
points.

Table 1. Real execution performance averaged over ten runs
of some local and global optimization algorithms on a
SUN SPARC 2 workstation. Time units: seconds. Symbol
“G/L” stands for the number of times that all the algorithms hit
global/local minimum points.

       Problems                               Execution Time
    m     n   l   G/L    SAT1.3    SAT1.7   SAT1.13    SAT1.8   SAT14.6  SAT14.16
  100   100   3  10/0     0.003     0.001     0.001     0.002     0.002     0.003
  200   100   3  10/0     0.007     0.004     0.010     0.004     0.005     0.008
  300   100   3  10/0     0.035     0.008     0.015     0.004     0.014     0.012
  400   100   3  10/0     0.464     0.027     0.145     0.030     0.040     0.215
 1000  1000   3  10/0     0.036     0.030     0.048     0.029     0.033     0.051
 1500  1000   3  10/0     0.087     0.055     0.078     0.049     0.058     0.081
 2000  1000   3  10/0     0.192     0.084     0.113     0.080     0.093     0.115
 2500  1000   3  10/0     0.371     0.124     0.158     0.114     0.133     0.180
 3000  1000   3  10/0     0.872     0.179     0.310     0.164     0.241     0.359
 3500  1000   3  10/0     6.878     0.636     1.008     0.588     0.919     1.357
 1000  1000   4  10/0     0.026     0.022     0.045     0.027     0.029     0.040
 2000  1000   4  10/0     0.094     0.061     0.103     0.061     0.057     0.092
 3000  1000   4  10/0     0.239     0.094     0.160     0.091     0.109     0.166
 4000  1000   4  10/0     0.483     0.144     0.230     0.135     0.162     0.234
 5000  1000   4  10/0     1.004     0.227     0-'38     0.210     0.267     0.330
 6000  1000   4  10/0     2.410     0.383     0 .65     0.359     0.388     0.478
 7000  1000   4  10/0     5.999     0.865     0.:29     0.852     0.756     0.840
 8000  1000   4  10/0     36.17     1.896     2.088     1.821     2.595     2.641
 8500  1000   4  10/0     140.3     10.79     7.974     10.51     12.79     12.12
10000  1000   5  10/0     2.899     0.451     0.610     0.393     0.464     0.567
11000  1000   5  10/0     3.799     0.489     0.800     0.426     0.580     0.750
12000  1000   5  10/0     6.729     0.593     0.839     0.505     0.649     0.844
13000  1000   5  10/0     9.541     0.761     1.154     0.681     0.978     1.064
14000  1000   5  10/0     21.41     1.107     1.308     0.969     1.282     1.652
15000  1000   5  10/0     60.80     1.671     2.207     1.429     2.047     2.166
10000   400   6  10/0     12.58     0.497     0.625     0.463     0.514     0.771
10000   500   6  10/0     4.353     0.377     0.640     0.342     0.345     0.553
10000   600   6  10/0     2.571     0.328     0.439     0.280     0.331     0.534
10000   700   6  10/0     1.989     0.284     0.550     0.248     0.289     0.491
10000   800   6  10/0     1.776     0.277     0.494     0.256     0.287     0.452
10000   900   6  10/0     1.305     0.289     0.523     0.248     0.278     0.476
10000  1000   6  10/0     1.140     0.264     0.488     0.227     0.269     0.473
20000  1000   7  10/0     3.238     0.500     1.124     0.421     0.496     1.004
30000  2000   7  10/0     4.110     0.882     1.733     0.722     0.910     1.460
40000  3000   7  10/0     5.557     1.289     2.382     1.114     1.250     2.196
50000  4000   7  10/0     6.793     1.666     3.036     1.386     1.632     2.730
60000  5000   7  10/0     7.942     1.260     3.719     1.833     1.971     3.402
10000  1000  10  10/0     0.143     0.050     0.377     0.034     0.048     0.312
20000  2000  10  10/0     0.408     0.124     0.821     0.090     0.099     0.664
30000  3000  10  10/0     0.726     0.258     1.311     0.197     0.179     1.076
40000  4000  10  10/0     0.963     0.305     1.826     0.241     0.328     1.511
50000  5000  10  10/0     1.262     0.441     2.372     0.357     0.395     1.887

From numerous algorithm executions, we observe that, for random l-SAT
instances listed in Table 2, DPL was slower than the rest of the SAT optimization
algorithms. As the input size increased, the number of failures, F, increased
quickly. For some slightly large inputs, such as m = 5000, n = 500, and l = 5,

Table 2. Performance comparison averaged over ten runs between
a DPL and some optimization algorithms on a SUN SPARC 2
workstation for solving l-SAT problem instances. Time units: seconds.
Symbol “S/F” in Column 4 stands for DPL’s success/failure
to give an answer within a time limit of 120 x m/n seconds, whereas
symbol “G/L” stands for the number of times that all the remaining
SAT optimization algorithms hit global/local minimum points.


Problems  I 

Execution  Time 

m 

n 

i 

_S/F 

DPL 

■ma 

SAT1.7 

SATl. 13 

SATl. 8 

SAT14.6 

SAT14.16 

500 

500 

3 

2,159 

10/0 

0.021 

o.oli 

0.013 

0.027 

750 

500 

3 

2.916 

10/0 

0.015 

0.013 

0.033 

1000 

500 

3 

3.657 

10/0 

0.048 

0.032 

0.031 

0.047 

500 

3 

9/1 

5.797 

10/0 

0.044 

0.067 

0.072 

500 

3 

6/4 

9.147 

10/0 

0.117 

0.115 

0.108 

500 

4 

4.684 

10/0“ 

0.026 

0.024 

0.038 

500 

4 

mEam 

7.960 

10/0 

0.075 

0.043 

■iAU 

0.066 

500 

4 

8/2 

10.27 

10/0 

0.066 

0.062 

■im 

2500 

500 

4 

2/8 

15.96 

10/0 

0.085 

HUH 

0.152 

3000 

500 

4 

1/9 

46.33 

10/0 

0.118 

0.115 

0.153 

0.234 

500 

5 

10/0 

16.90  i 

tSBSi 

0.094 

0.082 

0.139 

Kii9 

500 

5 

5/5 

28.39  ; 

0.119 

0.144 

0.188 

500 

5 

EOEI 

10/0 

0.180 

0.253 

0.196 

0.288 

Kisn 

500 

5 

0/10 

>1440 

10/0 

0.313 

0.370 

0.267 

0.313 

0.471 

7000 

500 

5 

0/10 

>1680 

10/0 

0.591 

0.575 

0.478 

0.604 

0.623 

10000 

1000 

10 

10/0 

101.8 

10/0 

0-047 

0.382 

0,038 

0.049 

0.315 

12000 

1000 

10 

10/0 

124.3 

10/0 

0.073 

0.458 

0.053 

0.052 

0.382 

14000 

1000 

10 

10/0 

145.2 

10/0 

0.077 

0.562 

0.058 

0.063 

0.430 

16000 

1000 

10 

10/0 

167.1 

10/0 

0.098 

0.596 

0.073 

0.078 

0.517 

18000 

1000 

10 

10/0 

188.6 

10/0 

0.136 

0.716 

0.102 

0.095 

0.583 

all ten algorithm executions failed after a reasonably long time limit. Due to its
average run-time complexity, even for some fairly easy instances, such
as m = 10000, n = 1000, and l = 10, DPL took an excessive amount of time to find
a solution. In comparison, local search and global optimization algorithms were
successful for all ten executions. They were able to find a solution to the given
instances efficiently.

Table 2 suggests that DPL may not be a suitable candidate for large size random
l-SAT instances. This observation should not be generalized to other application
cases. In many other applications, as observed by others [69, 70, 126], DPL
performed very well.


Performance Comparison with GSAT. Table 3 compares the performance
between some local search and global optimization algorithms running on a SUN
SPARC 2 workstation and GSAT [469] running on a MIPS computer with
comparable computing power [468]. Since GSAT is essentially a version of the
sequential local search (i.e., SAT1) algorithm, for solving 3-SAT instances generated
from the same input model used in [380], local search and global optimization
algorithms performed approximately tens to hundreds of times faster than GSAT.
Among them, the parallel DNF local search (SAT1.8) algorithm and complete global
optimization (SAT14.16) were the best.


Performance Comparison with Interior Point Zero-One Integer Programming
Algorithm. Recently, Kamath et al. used an interior point zero-one
integer programming algorithm to solve SAT [301, 299]. They implemented their
Table 3. Performance comparison between some optimization algorithms
running on a SUN SPARC 2 workstation and the GSAT algorithm running on a
MIPS computer with comparable computing power for the 3-SAT problem
instances. Time units: seconds.

    Problems      |                    Execution Time
   m     n    l   |  GSAT    SAT1.7   SAT1.13   SAT1.8   SAT14.6   SAT14.16
  215    50   3      0.400   0.019    0.100     0.006    0.020     0.030
  301    70   3      0.900   0.054    0.010     0.020    0.020     0.020
  430   100   3      6.000   0.336    0.040     0.420    0.050     0.370
  516   120   3      14.00   0.596    0.810     1.136    0.410     0.040
  602   140   3      14.00   0.260    8.060     0.750    1.990     0.170
  645   150   3      45.00   0.102    0.190     0.120    0.040     0.870
  860   200   3      168.0   1.776    0.970     0.070    6.710     0.490
 1062   250   3      246.0   3.106    20.71     0.070    12.43     0.090
 1275   300   3      720.0   8.822    19.66     3.750    19.14     4.510

Table 4. Performance comparison between some optimization algorithms
running on a SUN SPARC 2 workstation and an interior point zero-one
integer programming algorithm running on a KORBX(R) parallel/vector
computer for solving average 3-SAT problem instances. Time units: seconds.
Symbol "S/F" stands for the number of times that IP hits the global/local
minimum points, whereas symbol "G/L" stands for the number of times that
the remaining SAT algorithms hit the global/local minimum points.

     Problems      |                          Execution Time
    m      n    l  |  S/F     IP      G/L    SAT1.7  SAT1.13  SAT1.8  SAT14.6  SAT14.16
   100     50   5    52/0     0.7    10/0    0.004   0.001
   200    100   5    70/0     1.1    10/0    0.006   0.010    0.006   0.005    0.007
   400    200   7    69/0     3.5    10/0    0.007   0.014    0.007   0.007    0.018
   800    400  10    31/0     5.6    10/0    0.009   0.034    0.009   0.003    0.030
   800    400   7    20/0     7.8    10/0    0.014   0.032    0.014   0.009    0.026
  1000    500  10    49/0     7.4    10/0    0.012   0.037    0.012   0.006    0.039
  2000   1000  10    10/0    18.5    10/0    0.032   0.091    0.032   0.019    0.083
  2000   1000   7    50/0    21.5    10/0    0.056   0.099    0.056   0.055    0.055
  2000   1000   3    49/1    50.4    10/0    2.657   0.162    2.657   3.917    27.19
  4000   1000   4     1/1  1085.4    10/0    10.63   11.07    10.63   6.826    9.555
  4000   1000  10    10/0    25.1    10/0    0.055   0.189    0.055   0.044    0.163
  8000   1000  10    10/0    38.0    10/0    0.219   0.456    0.219   0.254    0.353
 16000   1000  10    10/0    66.4    10/0    0.603   1.042    0.603   0.625    1.052
 32000   1000  10    10/0   232.4    10/0    1.701   2.720    1.701   1.611    2.434

algorithm in the FORTRAN and C languages and ran the algorithm on a KORBX(R)
parallel/vector computer with instances generated from the average 3-SAT input
model. The KORBX(R) parallel computer operates in scalar mode at approximately
1 MFlops and at 32 MFlops in full vector concurrent mode. Their execution
results are given in Columns 4 and 5 of Table 4.

We  ran  local  search  and  global  optimization  algorithms  for  the  same  instances 
(listed  in  [301,  299])  on  a  SUN  SPARC  2  workstation.  The  results  are  given  in 
Table  4.  Apparently,  as  compared  to  the  interior  point  zero-one  integer  program¬ 
ming  algorithm  running  on  a  parallel  computer,  in  addition  to  improved  global 
convergence,  local  search  and  global  optimization  algorithms  were  much  simpler 
Table 5. WSAT (GSAT with random walk)'s real execution performance for
hard random 3-SAT problem instances on an SGI Challenge with a 70 MHz
MIPS R4400 processor. Time unit: seconds [472].

   Problems    |           GSAT            |           Walk
    n      m   |  time    flips        R   |  time    flips        R
   100    430      .4     7554        8.3      .2     2385        1.0
   200    860      22     284693      143      4      27654       1.0
   400   1700     122     2.6 x 10^6   67      7      59744       1.1
   600   2550    1471     30 x 10^6   500      35     241651      1.0
   800   3400       *     *             *     286     1.8 x 10^6  1.1
  1000   4250       *     *             *    1095     5.8 x 10^6  1.2
  2000   8480       *     *             *    3255     23 x 10^6   1.1

Table 6. WSAT (GSAT with random walk)'s real execution performance for
hard random 3-SAT problem instances on a PC. Time unit: seconds [367].

    n      m    inst.     time       flips    solved        ratio
   100    430    500      0.18        2803      88%         31.85
   200    860    500      1.99       18626      73%        255.85
   400   1700    500     15.03      204670     100%       2046.70
   600   2550    500     19.59      250464      62%       4013.85
   800   3400    500    140.61     1809986      67%      26854.39
  1000   4250    500    369.88     4633763      57%      81009.84
  2000   8240     50   3147.26    26542387      16%    1658899.19

and  achieved  several  orders  of  magnitude  of  performance  improvements  in  terms  of 
computing  time. 

13.2. Experiments on Hard Random Formulas. We compare the performance
of two local search algorithms and a tabu search algorithm for the hard
random 3-SAT problem instances generated from the mwff generator. All three
programs were written in C. Table 5 gives the real execution performance of
WSAT (GSAT with random walk) on an SGI Challenge with a 70 MHz MIPS R4400
processor [472, 471].
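To make the phrase "GSAT with random walk" concrete, the sketch below mixes a greedy GSAT move (flip the variable that leaves the fewest unsatisfied clauses) with a random-walk move (flip a variable taken from a randomly chosen unsatisfied clause). It is an unoptimized illustration under stated assumptions, not the program timed in Table 5: the function names, the walk probability `p`, the flip and restart limits, and the clause format (lists of nonzero integers, no empty clauses) are all hypothetical choices.

```python
import random


def is_true(lit, assign):
    """A literal k > 0 is true when x_k is True; -k when x_k is False."""
    return assign[abs(lit)] == (lit > 0)


def unsat_clauses(clauses, assign):
    return [c for c in clauses if not any(is_true(l, assign) for l in c)]


def gsat_walk(clauses, n, max_flips=10000, max_tries=10, p=0.5, seed=0):
    """Local search over n variables; returns a satisfying assignment
    (list indexed 1..n, index 0 unused) or None if none was found."""
    rng = random.Random(seed)
    for _ in range(max_tries):
        # Random initial truth assignment for this try.
        assign = [False] + [rng.random() < 0.5 for _ in range(n)]
        for _ in range(max_flips):
            unsat = unsat_clauses(clauses, assign)
            if not unsat:
                return assign
            if rng.random() < p:
                # Random-walk move: variable from a random unsatisfied clause.
                var = abs(rng.choice(rng.choice(unsat)))
            else:
                # Greedy GSAT move: try every flip, keep the best one.
                best, var = None, None
                for v in range(1, n + 1):
                    assign[v] = not assign[v]
                    score = len(unsat_clauses(clauses, assign))
                    assign[v] = not assign[v]
                    if best is None or score < best:
                        best, var = score, v
            assign[var] = not assign[var]
    return None  # incomplete method: failure does not prove unsatisfiability
```

Each greedy step here rescans all clauses for every variable, so a practical implementation would maintain incremental gain counts instead. The "flips" columns in Tables 5 to 7 count iterations of the inner loop.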


Tables 6 and 7 show the experimental results of WSAT and TSAT (Tabu search
for SAT) programs written in C under Linux 1.1.59 for PC [367]. On the same
machine, Mazure, Sais, and Gregoire compared GSAT with TSAT and found
that TSAT was more efficient in most cases. In addition, TSAT was able to solve
more problem instances than GSAT. The testing of TSAT on the large-size
example with n = 2000 and m = 8240, however, was terminated at m/n = 4.12,
before entering into the hard region of the random 3-SAT instances.
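The clause-to-variable ratio m/n used above is simple to compute, and a fixed-clause-length random instance at a chosen ratio is easy to generate. The sketch below is a hedged illustration: `clause_ratio` and `random_ksat` are hypothetical helper names, and the sampling scheme (k distinct variables per clause, each negated with probability 1/2) is an assumption about the input model, not a description of the generator used in [367] or [472].

```python
import random


def clause_ratio(m, n):
    """Ratio of clauses (m) to variables (n)."""
    return m / n


def random_ksat(n, m, k=3, seed=None):
    """Generate m random clauses over variables 1..n, each with k
    distinct variables, every literal negated with probability 1/2."""
    rng = random.Random(seed)
    clauses = []
    for _ in range(m):
        variables = rng.sample(range(1, n + 1), k)
        clauses.append([v if rng.random() < 0.5 else -v for v in variables])
    return clauses


# The TSAT test case in the text: n = 2000, m = 8240.
tsat_ratio = clause_ratio(8240, 2000)   # 4.12
# The largest WSAT case in Table 5: n = 2000, m = 8480.
wsat_ratio = clause_ratio(8480, 2000)   # 4.24
```

The two ratios differ only in the second decimal place, but the experiments in this section treat m/n = 4.12 as lying just outside the hard region, while the instances at 4.23 and above are the ones reported as hard.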
Table 7. TSAT's real execution performance for hard random
3-SAT problem instances on a PC. Time unit: seconds [367].

    n      m    inst.     time       flips    solved        ratio
   100    430    500      0.11        1633      93%         17.60
   200    860    500      0.73        9678      74%        130.78
   400   1700    500     11.51      145710     100%       1457.10
   600   2550    500     13.92      167236      65%       2580.80
   800   3400    500     99.45     1143444      71%      16150.34
  1000   4250    500    292.10     3232463      62%      51802.29
  2000   8240     50   3269.15    29415465      40%     735386.63

The performance of the SAT1.5 algorithm [211, 222] (Section 7.7) is shown
in Table 8. For hard problem instances in the transition region [380], SAT1.5 can
solve large-size SAT problem instances efficiently. It took WSAT on average 3,255
seconds to solve the n = 2,000 and m = 8,480 instances on an SGI Challenge with
a 70 MHz MIPS R4400 processor. On a SUN SPARC 20 workstation, the SAT1.5
algorithm was able to solve the same problem instance in some 530 seconds on
average [222, 219]. For hard, large-size problem instances with n > 5,000, the SAT1.5
algorithm was able to handle them comfortably.


13.3.  Experiments  on  Structured  Instances.  We  now  take  a  look  at  the 
performance  of  SAT  algorithms  for  some  structured  instances. 

Instances Generated from the N-Queens Problem. To assess the performance
of local search and global optimization algorithms with non-binary
instances, we also tested SAT instances generated from instances of the n-queens
problem. Figure 32 compares the performance between DP and some optimization
algorithms. It also compares the performance between DP and SAT14.11 [210],
a complete, continuous global optimization algorithm. Due to expensive floating
point computations, the execution time of SAT14.11 is higher than those of other
discrete local search and global optimization algorithms.
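One common way to turn an n-queens instance into a SAT instance is sketched below. It is an assumed encoding given for illustration; the source does not specify which encoding was used for Figure 32. Variable numbering, the helper names `queens_cnf` and `satisfies`, and the choice of "at least one queen per row" plus pairwise "no two attacking queens" clauses are all assumptions.

```python
from itertools import combinations


def queens_cnf(n):
    """Encode n-queens as CNF. Variable r*n + c + 1 is true when a
    queen stands on row r, column c (both 0-based)."""
    def var(r, c):
        return r * n + c + 1

    clauses = []
    # At least one queen in every row.
    for r in range(n):
        clauses.append([var(r, c) for c in range(n)])
    # No two queens on the same row, column, or diagonal.
    cells = [(r, c) for r in range(n) for c in range(n)]
    for (r1, c1), (r2, c2) in combinations(cells, 2):
        attacks = (r1 == r2 or c1 == c2
                   or abs(r1 - r2) == abs(c1 - c2))
        if attacks:
            clauses.append([-var(r1, c1), -var(r2, c2)])
    return clauses


def satisfies(clauses, true_vars):
    """Check a candidate given as the set of variables assigned True."""
    return all(any((lit > 0) == (abs(lit) in true_vars) for lit in clause)
               for clause in clauses)
```

For n = 4, the placement with queens in columns 1, 3, 0, 2 of rows 0 to 3 corresponds to the true variables {2, 8, 9, 15} and satisfies the formula, whereas the main diagonal {1, 6, 11, 16} does not. The number of binary clauses grows roughly cubically in n, so these instances are much more structured than random 3-SAT formulas with the same number of variables.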

DIMACS Instances. For the same SAT formulas generated from instances
of the Boolean inference problem [300], the performance of SAT1.7 [212], a parallel
local search algorithm, and a simple backtracking algorithm [326] is shown in Tables
9 and 10, respectively. An algorithm may be effective for only one type of input.
The results suggest that it can be much more efficient if we use several different
types of algorithms to handle the same inputs simultaneously.
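The idea of running several different types of algorithms on the same input can be sketched as a small round-robin "portfolio". The code below is only a toy illustration under stated assumptions: the two solvers (exhaustive enumeration and a pure random walk) are stand-ins chosen for brevity, not SAT1.7 or the backtracking algorithm of [326], and all names are hypothetical. Each solver is written as a generator that performs one step per `next()` call, so the scheduler can interleave them on a single processor.

```python
import itertools
import random


def satisfied(clauses, assign):
    """assign is indexed 1..n (index 0 unused)."""
    return all(any(assign[abs(l)] == (l > 0) for l in c) for c in clauses)


def enumerate_solver(clauses, n):
    """Complete solver: tests one of the 2^n assignments per step."""
    for bits in itertools.product([False, True], repeat=n):
        assign = (False,) + bits
        yield assign if satisfied(clauses, assign) else None
    # Generator exhaustion means the formula is unsatisfiable.


def random_walk_solver(clauses, n, seed=0):
    """Incomplete solver: one random-walk flip per step."""
    rng = random.Random(seed)
    assign = [False] + [rng.random() < 0.5 for _ in range(n)]
    while True:
        unsat = [c for c in clauses
                 if not any(assign[abs(l)] == (l > 0) for l in c)]
        if not unsat:
            yield tuple(assign)
            return
        v = abs(rng.choice(rng.choice(unsat)))
        assign[v] = not assign[v]
        yield None


def portfolio(clauses, n, max_steps=100000):
    """Interleave both solvers; return (solver name, assignment).
    The assignment is None if unsatisfiability was proved or the
    step budget ran out."""
    solvers = {"enumeration": enumerate_solver(clauses, n),
               "random walk": random_walk_solver(clauses, n)}
    for _ in range(max_steps):
        for name, gen in list(solvers.items()):
            try:
                result = next(gen)
            except StopIteration:
                if name == "enumeration":
                    return name, None   # complete search exhausted: UNSAT
                del solvers[name]
                continue
            if result is not None:
                return name, result
    return None, None
```

Whichever solver reaches an answer first determines the running time, so the combination costs at most about twice the step count of the faster method on each input. This mirrors the contrast between Tables 9 and 10, where each algorithm fails on inputs the other handles.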

In  Table  11,  we  compare  A2  [535]  with  WSAT,  GSAT,  and  Davis-Putnam’s 
algorithm  in  solving  the  circuit  diagnosis  benchmark  problems.  We  present  average 
execution  times  and  average  number  of  iterations  of  A2  as  well  as  published  average 
execution  times  of  WSAT,  GSAT  and  Davis-Putnam’s  method  [472].  We  did  not 
attempt  to  reproduce  the  reported  results  of  GSAT  and  WSAT,  since  the  results 
may  depend  on  initial  conditions,  such  as  the  seeds  of  the  random  number  generator 
Figure 32. Comparison of DP with SAT1.7, SAT1.13,
SAT14.6, SAT14.16, and SAT14.11 for solving SAT instances
generated from CSP instances
Table 8. Real execution performance of the SAT1.5 algorithm
for hard random 3-SAT problem instances on a SUN SPARC 20
workstation. For each problem, 30 random instances were tested.
The minimum (Tmin), maximum (Tmax), and average (Tmean) execution
times were recorded. "S" indicates the number of success
cases of finding solutions within the time limit (T-limit). Time
unit: second.

     n       m      m/n     S/30      Tmin     Tmean      Tmax   T-limit
   1000    4230   4.2300   20/30      6.04    206.90    956.15     1000
   1000    4240   4.2400   15/30      0.55    223.26    891.11     1000
   1000    4250   4.2500   14/30      1.69    88.370    454.79     1000
   1000    4260   4.2600   10/30      24.4    243.25    914.35     1000
   2000    8460   4.2300   12/30     115.8    779.44    2069.9     3000
   2000    8480   4.2400   14/30     17.64    530.32    1360.9     3000
   2000    8500   4.2500    7/30     58.09    789.35    1677.6     3000
   2000    8510   4.2550    9/30     59.08    840.33    2322.8     3000
   2000    8460   4.2300   15/30     58.95    684.32    4508.2     5000
   2000    8480   4.2400   15/30     15.31    1273.5    4057.9     5000
   2000    8500   4.2500   15/30     112.7    1527.8    3644.5     5000
   2000    8520   4.2600    9/30     123.2    1522.9    4338.5     5000
   3000   12690   4.2300   12/30     430.6    1787.2    2876.4     5000
   3000   12700   4.2333   18/30     122.5    2101.5    4479.4     5000
   3000   12720   4.2400   11/30     270.6    1503.6    3840.7     5000
   3000   12740   4.2467   12/30     229.1    2062.9    4807.4     5000
   3000   12680   4.2267   15/30     356.1    2788.3    9510.2    10000
   3000   12700   4.2333   11/30     503.3    3681.1    8247.9    10000
   3000   12720   4.2400   15/30     30.09    2300.3    7002.3    10000
   3000   12740   4.2467    8/30     563.3    2620.5    5330.9    10000
   4000   16920   4.2300   11/30    739.83    4064.5   11498.2    12000
   4000   16930   4.2325   10/30    1733.5    5472.0   10187.8    12000
   4000   16940   4.2350    7/30    571.20    1948.9   4768.92    12000
   4000   16960   4.2400   10/30    294.80    3709.0   9921.77    12000
   5000   21150   4.2300    8/30    2024.7    3867.9   8134.81     9000
   5000   21175   4.2350    6/30    1640.1    2982.7   4193.68     9000
   5000   21200   4.2400    3/30    2935.8    4435.7   6357.65     9000
   5000   21225   4.2450    4/30    3883.5    6025.6   10980.9    15000
  10000   41000   4.1000   30/30    294.44    1315.6   3849.38    20000
  10000   41800   4.1800    8/18    4294.5    8387.9   16654.8    20000
  10000   42000   4.2000    4/30    963.53    5877.8   12020.3    20000
  10000   42200   4.2200    2/30    9270.6   14241.9   19213.4    20000

and other program parameters. We ran A2 on an SGI Challenge^ so that our timing


^Based on a single-CPU 150-MHz SGI Challenge with MIPS R4400 at the University of
Illinois National Center for Supercomputing Applications, we estimate empirically that it is 15.4%
Table 9. Performance of SAT1.7 on a SUN SPARC 10 workstation.
Time units: seconds.

         Problems            |   Ten Trials   |      Execution Time
 Name           n       m    |  Global   SAT  |    Min      Mean      Max
 ii16a1.sat   1650   19368      10/10    YES     10.320   125.51   417.42
 ii16b1.sat   1728   24792      10/10    YES     0.6100   6.1130   28.760
 ii16c1.sat   1580   16467      10/10    YES     0.5400   1.8740   3.7500
 ii16d1.sat   1230   15901      10/10    YES     0.4900   1.3810   2.8200
 ii16e1.sat   1245   14766      10/10    YES     0.5300   0.9720   1.4800
 ii16a2.sat   1602   23281      N/A
 ii16b2.sat   1076   16121      10/10    YES     1.9000   39.118   102.60
 ii16c2.sat    924   13803      10/10    YES     0.3500   14.109   41.650
 ii16d2.sat    836   12461      10/10    YES     0.3300   19.840   52.410
 ii16e2.sat    532    7825      10/10    YES     0.5000   6.8830   21.980
 ii32a1.sat    459    9212      10/10    YES     0.3600   3.5740   10.330
 ii32b1.sat    228    1374      10/10    YES     0.1100   0.7390   1.6700
 ii32b2.sat    261    2558      10/10    YES     0.1000   1.9040   4.4700
 ii32b3.sat    348    5734      10/10    YES     1.6400   10.559   19.330
 ii32b4.sat    381    9618      10/10    YES     0.5100   2.3060   4.7800
 ii32c1.sat    225    1280      10/10    YES     0.0100   0.1150   0.4800
 ii32c2.sat    249    2182      10/10    YES     0.0600   0.3980   0.9000
 ii32c3.sat    279    3272      10/10    YES     0.6900   5.4900   16.850
 ii32c4.sat    759   20862      10/10    YES     5.5200   361.80   1496.3
 ii32d1.sat    332    2703      10/10    YES     0.2200   1.0680   3.1000
 ii32d2.sat    404    5153      10/10    YES     0.2100   0.9140   2.1800
 ii32d3.sat    824   19478      10/10    YES     1.7100   49.522   109.52
 ii32e1.sat    222    1186      10/10    YES     0.0200   0.3260   1.0700
 ii32e2.sat    267    2746      10/10    YES     0.0400   0.1130   0.3400
 ii32e3.sat    330    5020      10/10    YES     0.4500   5.2700   13.910
 ii32e4.sat    387    7106      10/10    YES     0.2700   10.734   46.750
 ii32e5.sat    522   11636      10/10    YES     0.4900   23.424   84.470

Table 10. Performance of a simple backtracking algorithm
on a SUN SPARC 10 workstation. Time units: seconds.

 Name           n       m     SAT    Time
 ii16a1.sat   1650   19368    YES    1.285
 ii16b1.sat   1728   24792    YES    1.490
 ii16c1.sat   1580   16467    N/A    1.956
 ii16d1.sat   1230   15901    YES    1.660
 ii16e1.sat   1245   14766    N/A    2.125
 ii16a2.sat   1602   23281    YES    1.430
 ii16b2.sat   1076   16121    YES    1.505
 ii16c2.sat    924   13803    YES    2.016
 ii16d2.sat    836   12461    YES    1.665
 ii16e2.sat    532    7825    N/A    2.051
 ii32a1.sat    459    9212    YES    1.160
 ii32b1.sat    228    1374    YES    1.035
 ii32b2.sat    261    2558    YES    1.035
 ii32b3.sat    348    5734    YES    1.240
 ii32b4.sat    381    9618    YES    1.285
 ii32c1.sat    225    1280    YES    0.000
 ii32c2.sat    249    2182    YES    1.325
 ii32c3.sat    279    3272    YES    1.240
 ii32c4.sat    759   20862    YES    1.695
 ii32d1.sat    332    2703    YES    1.035
 ii32d2.sat    404    5153    YES    1.525
 ii32d3.sat    824   19478    YES    1.755
 ii32e1.sat    222    1186    YES    0.000
 ii32e2.sat    267    2746    YES    1.035
 ii32e3.sat    330    5020    YES    1.565
 ii32e4.sat    387    7106    YES    1.615
 ii32e5.sat    522   11636    YES    1.655
Table 11. Comparison of A2's execution times in seconds averaged
over 10 runs with respect to published results on some of
the circuit diagnosis problems in the DIMACS archive, including
the best known results obtained by WSAT, GSAT, and Davis-Putnam's
algorithm [472].

      Problem            |             A2              |
 Id             n     m  |  SS 10/51    SGI   # Iter.  |  WSAT   GSAT   DP
 ssa7552-038  1501  3575     0.228     0.235    7970       2.3    129    7
 ssa7552-158  1363  3034     0.088     0.102    2169       2       90    *
 ssa7552-159  1363  3032     0.085     0.118    2154       0.8     14
 ssa7552-160  1391  3126     0.097     0.113    3116       1.5     18    *

• A2: Sun SparcStation 10/51 and a 150-MHz SGI Challenge with MIPS R4400;

• GSAT, WSAT and DP: SGI Challenge with a 70 MHz MIPS R4400.


Table 12. Comparison of A2's execution times in seconds averaged
over 10 runs with published results on circuit synthesis problems
from the DIMACS archive, including the best known results
obtained by GSAT, integer programming, and simulated annealing [472].

     Problem        |             A2              |         Integer
 Id.       n     m  |  SS 10/51    SGI   # Iter.  |  GSAT    Prog.    SA
 ii16a1  1650  19368    0.122     0.128     819        2      2039    12
 ii16b1  1728  24792    0.265     0.310    1546       12        78    11
 ii16c1  1580  16467    0.163     0.173     797        1       758     5
 ii16d1  1230  15901    0.188     0.233     908        3      1547     4
 ii16e1  1245  14766    0.297     0.302     861        1      2156     3

• A2: Sun SparcStation 10/51 and a 150-MHz SGI Challenge with MIPS R4400;

• GSAT and SA: SGI Challenge with a 70 MHz MIPS R4400;

• Integer Programming: VAX 8700.


results  can  be  compared  to  those  of  GSAT  and  WSAT.  Our  results  show  that  A2 
is  approximately  one  order  of  magnitude  faster  than  WSAT. 

In  Table  12,  we  compare  A2  [535]  with  the  published  results  of  GSAT,  integer 
programming  and  simulated  annealing  on  the  circuit  synthesis  problems  [472].  Our 
results  show  that  A2  performs  several  times  faster  than  GSAT. 

In Table 13, we compare the performance of the three versions of DLM with
some of the best known results of GSAT on circuit-synthesis, parity-learning, some
artificially generated 3-SAT, and some of the hard graph coloring problems. The
results on GSAT are from [473], which are better than other published results.
Our results show that DLM is consistently faster than GSAT on the "ii" and "par"
inputs, and that A1 is an order of magnitude faster than GSAT on some "aim"
inputs.


slower  than  a  Sun  SparcStation  10/51  for  executing  A2  to  solve  SAT  benchmark  problems.  How¬ 
ever,  we  did  not  evaluate  the  speed  difference  between  a  150-MHz  SGI  Challenge  and  a  70-MHz 
SGI  Challenge  on  which  GSAT  and  WSAT  were  run. 
Table 13. Comparison of DLM's execution times in seconds
averaged over 10 runs with the best known results obtained by
GSAT [473] on the circuit-synthesis, parity-learning, artificially
generated 3-SAT instances, and graph coloring problems from the
DIMACS archive.

 Problem                              |         A1          |       GSAT
 Identification          n        m   | SS 10/51   Success  |   Time   Success
                                      |            Ratio    |          Ratio
 aim-100-2_0-yes1-1     100      200      0.19     10/10        1.96     9/10
 aim-100-2_0-yes1-2     100      200      0.65     10/10        1.6     10/10
 aim-100-2_0-yes1-3     100      200      0.19     10/10        1.09    10/10
 aim-100-2_0-yes1-4     100      200      0.10     10/10        1.54    10/10
                                      |         A2          |       GSAT
 ii32b3                 348     5734      0.31     10/10        0.6     10/10
 ii32c3                 279     3272      0.12     10/10        0.27    10/10
 ii32d3                 824    19478      1.05     10/10        2.24    10/10
 ii32e3                 330     5020      0.16     10/10        0.49    10/10
 par8-2-c                68      270      0.06     10/10        1.33    10/10
 par8-4-c                67      266      0.09     10/10        0.2     10/10
                                      |         A3          |       GSAT
 g125.17               2125    66272   1390.32     10/10      264.07     7/10
 g125.18               2250    70163      3.197    10/10        1.9     10/10
 g250.15               3750   233965      2.798    10/10        4.41    10/10
 g250.29               7250   454622   1219.56      9/10     1219.88     9/10

• A1, A2, A3: Sun SparcStation 10/51

• GSAT: SGI Challenge (model unknown)


We are designing new strategies to improve A3's [535] performance. Table 14
shows some preliminary but promising results of A3 on some of the more difficult
but satisfiable DIMACS benchmark inputs.


13.4. Experiments on Practical Industrial Benchmarks. Performance
of the SAT-Circuit Solver with Partitioning Preprocessing. We compare in
Table 15 Gu and Puri's SAT solver (with partitioning preprocessing) [223]
with existing algorithms [329, 526] for solving industrial asynchronous circuit
design benchmarks, including the HP and Philips benchmarks. In the table, N and m
are the initial number of states and initial number of signals, respectively.
Correspondingly, N' and m' are the final number of states and final number of signals.
Symbol A indicates the 2-level implementation area.

The experimental results indicate that, as compared to the previous methods
[329, 526], the SAT-Circuit solver with partitioning preprocessing achieves many
orders of magnitude of performance improvement in terms of computing time, in
addition to a reduced implementation area. For example, for a large circuit mr0,
SAT-Circuit took 2.80 seconds to solve the problem and yielded a two-level
implementation area with 41 literals.^ In contrast, Lavagno et al.'s algorithm took
1,084.5 seconds and an area of 86 literals. For this example, Vanbekbergen et


^A literal here is a standard unit for measuring layout area.
Table 14. Execution times in CPU seconds over 10 runs of A3 to
solve some of the more difficult DIMACS benchmark problems.

 Prob.        Succ.   |     Sun SS 10/51 Seconds
 Id.          Ratio   |    Avg.       Min.       Max.
 par8-1       10/10       4.780      0.133     14.383
 par8-2       10/10       5.058      0.100     13.067
 par8-3       10/10       9.903      0.350     21.150
 par8-4       10/10       5.842      0.850     16.433
 par8-5       10/10      14.628      1.167     34.900
 par16-1       5/10     11172.8     4630.6    20489.1
 par16-2       1/10       856.9      856.9      856.9
 par16-3       1/10     20281.6    20281.6    20281.6
 par16-4       3/10      3523.1     1015.0     7337.9
 par16-5       1/10     13023.4    13023.4    13023.4
 par16-1-c    10/10       398.1       11.7     1011.9
 par16-2-c    10/10      1324.3      191.0     4232.3
 par16-3-c    10/10       987.2      139.8     3705.2
 par16-4-c    10/10       316.7        5.7     692.66
 par16-5-c    10/10      1584.2      414.5     3313.2
 hanoi4        1/10       476.5      476.5      476.5
 f1000        10/10       126.8        4.4      280.7
 f600         10/10        16.9        2.1       37.2
 f2000        10/10      1808.6      174.3     8244.7

 Program parameters
 Flat region limit = 50; λ reset interval = 10,000; operation: λ = λ/1.5.

 Problem group      par16-[1-5]   test par problems   f      hanoi4
 Tabu length        100           50                  50     50
 Increment of λ     1             1/2                 1/16   1/2

al.'s algorithm could not yield a solution within 3,600 seconds and aborted due to
the backtracking limit. For another benchmark circuit mmu0, SAT-Circuit solved it in
0.87 seconds, as compared to a pre-aborted 406.3 seconds for Vanbekbergen et al.'s
approach [526].


Performance of a BDD SAT Solver with Partitioning Preprocessing.
The BDD SAT-Circuit solver was implemented in the C language. In this case, Gu
and Puri tested their BDD SAT-Circuit solver for its ability to find all solutions
(therefore, an optimal solution) for a large number of industrial asynchronous
circuit benchmarks including the HP and Philips benchmarks [223, 435]. They also
compared the performance of their BDD SAT-Circuit solver with the well-known
asynchronous circuit design technique of Lavagno et al. [329]. The results of these
experiments are given in Table 16 and Table 17. Table 16 compares the execution
time of the BDD SAT solver with the execution time of a simple backtracking SAT
algorithm of [326]. The experimental results are given for SAT instances generated
from Gu and Puri's SAT formula partitioning preprocessor [223]. Since the BDD
SAT-Circuit solver yielded all the solutions, they normalized the execution time
of the backtracking algorithm for all the truth assignments. The experimental
results (Table 16) show that the BDD SAT solver outperforms the backtracking SAT
technique for the practical SAT instances representing asynchronous circuit design.

They also calculated the implementation area of the designed circuits. Table 17
compares their BDD SAT solver with the well-known asynchronous circuit design
technique of Lavagno et al. [329]. The BDD SAT-Circuit solver yielded reduced circuit
Table 15. Experimental results comparing the SAT-Circuit
solver (with SAT formula partitioning preprocessing), Vanbekbergen
et al.'s algorithm, and Lavagno et al.'s algorithm, on practical
circuit benchmarks on a SUN SPARC-2 workstation. Time unit:
seconds.

 Circuit Specifications   |  Preprocessing [223]   | Vanbekbergen et al. [526] | Lavagno et al. [329]
 Name              N    m |   N'   m'   A    CPU   |   N'   m'   A     CPU     |   m'    A     CPU
 mr0             302   11    469   14   41   2.80     backtrack limit  >3600       13   86   1084.5
 mr1             190    8    373   12   55   1.73     backtrack limit  >872.9      10   53    237.5
 mmu0            174    8    441   11   49   0.87     backtrack limit  >406.3      state error
 mmu1             82    8    131   10   50   0.37     backtrack limit  >101.3      10   37     47.8
 sbuf-ram-write   58   10     93   12   59   0.36      90   12   74    5.21        12   35     54.6
 vbe4a            58    6    106    8   37   0.19     116    8   40    0.25         8   41     5.50
 nak-pa           56    9     59   10   25   0.20      58   10   32    0.08        10   41     20.8
 pe-rcv-ifc-fc    46    8     50    9   48   0.24      53    9   50    0.13         9   62     14.3
 ram-read-sbuf    36   10     44   11   28   0.15      53   11   44    0.06        11   23     65.2
 alex-nonfc       24    6     31    7   26   0.05      28    7   22    0.03        non-free-choice
 sbuf-send-pkt2   21    6     26    7   20   0.04      27    7   29    0.04         7   14      8.6
 sbuf-send-ctl    20    6     32    8   33   0.09      28    8   35    0.03         8   43      3.4
 atod             20    6     26    7   15   0.02      24    7   16    0.01         7   19      2.9
 pa               18    4     34    6   18   0.12      31    6   22    0.06        state error
 alloc-outbound   17    7     29    9   33   0.09      24    9   27    0.04         9   23      2.5
 wrdata           16    4     20    5   17   0.03      19    5   18    0.01         5   21      0.9
 fifo             16    4     23    5   15   0.03      20    5   17    0.02         5   15      0.7
 sbuf-read-ctl    14    6     18    7   16   0.06      16    7   20    0.01         7   15      1.5
 nousc            12    3     16    4   12   0.01      16    4   12    0.01         4   14      0.5
 vbe-ex2           8    2     12    4   18   0.08      12    4   18    0.03         4   21      0.5
 nousc-ser         8    3     10    4    9   0.02      10    4    9    0.01         4   11      0.4
 sendr-done        7    3     10    4    8   0.02      10    4    8    0.01         4    6      0.4
 vbe-ex1           5    2      8    3    7   0.01       8    3    7    0.01         3    7      0.3

Table 16. Experimental results comparing the BDD SAT-Circuit
solver and a backtracking SAT algorithm, both with SAT
formula partitioning preprocessing, on practical asynchronous
circuit benchmarks on a SUN SPARC-2 workstation. Time unit: second.

 STG Benchmark     BDD SAT     Backtracking
 Name              Solver      satisfiability testing
 Mr0                58.3
 Mmu1               28.1       >3,600
 SbufRamWr          32.7       >3,600
 Vbe4a              1.95       >3,600
 NakPa              0.53       5.4
 RamRdSbuf          0.25       76.8
 AlexNonFc          0.37       0.96
 SbufSndPkt2        0.37       88.06
 SbufSndCtl        18.27       353.6
 AtoD               0.15       11.88
 Pa                 0.05       4.50
 WrData             0.14       0.24
 Fifo               0.05       0.10
 SbufRdCtl          0.09       0.10
 NoUsc              0.09       0.16
 VbeEx2             3.94       0.80
 NoUscSer           0.06       0.07
 SendrDone          0.05       0.16
 VbeEx1             0.03       0.04
Table 17. Comparison of implementation area and design time of
the BDD SAT-Circuit solver (with SAT formula partitioning
preprocessing) and Lavagno et al.'s technique for practical
asynchronous circuit benchmarks on a SUN SPARC-2 workstation. Time
unit: second.

                      Benchmark       |        BDD SAT Solver         | Lavagno and Moon et al. [329]
 Benchmark       Initial   Initial    |  Final     Circuit    CPU     |  Final     Circuit    CPU
 Name            no. of    no. of     |  no. of    Area       time    |  no. of    Area       time
                 states    signals    |  signal    (literals) sec.    |  signal    (literals) sec.
 Mr0               302       11            15        41       58.36       13        86       1084.5
 Mmu1               82        8            10        38       28.16       10        37         47.8
 SbufRamWr          58       10            12        47       32.79       12        35         54.6
 Vbe4a              58        6             8        30        1.95        8        41          5.5
 NakPa              56        9            10        25        0.53       10        41         20.8
 RamRdSbuf          36       10            11        25        0.25       11        23         65.2
 SbufSndPkt2        24        6             7        21        0.37        7        14          8.6
 SbufSndCtl         21        6             7        17        0.37        8        43          3.4
 AtoD               20        6             8        30       18.27        7        19          2.9
 Pa                 20        6             7        14        0.15       Internal State Error
 WrData             16        4             5        18        0.05        5        21          0.9
 Fifo               16        4             5        15        0.14        5        15          0.7
 SbufRdCtl          14        6             7        16        0.05        7        15          1.5
 NoUsc              12        3             4        12        0.09        4        14          0.5
 VbeEx2             12        3             4        12        0.09        4        21          0.5
 NoUscSer            8        2             4        18        3.94
implementation area than Lavagno et al.'s algorithm for almost all the circuits in
the benchmark set [329]. Lavagno et al.'s method yields a total area of 449 literals
in 1298.5 seconds. In comparison, for the same benchmarks, the BDD SAT solver
achieved an area of 379 literals in 145.7 seconds. In addition, Lavagno et al.'s
method was unable to solve some benchmark circuits, such as Pa and AlexNonFc.
These results show that, as compared to existing techniques, the BDD SAT solver
is capable of achieving an average of 20% reduction in implementation area for
all the benchmarks. According to critical industrial evaluations, this BDD SAT
solver offers a practical solution for complex industrial asynchronous circuit design
problems.


14.  Applications 

Practical  application  problems  are  the  driving  forces  for  SAT  research.  They 
provide  the  ultimate  benchmarks  to  test  SAT  algorithms  and  techniques.  An  effec¬ 
tive  SAT  algorithm  in  one  application  problem  will  shed  light  on  solving  problems 
in  other  application  areas. 

The SAT problem has direct applications in mathematical logic, artificial
intelligence, VLSI engineering, and computing theory. It also has indirect applications
through other transferable problems, e.g., constraint satisfaction problems and
constrained optimization problems [226]. Due to the UniSAT models, some application
problems in the real space are related to SAT as well. In the following, we list some
applications that can be formulated and solved as instances of SAT.
• Mathematics: finding n-ary relations such as transitive closure [67], detecting
graph and subgraph isomorphisms [105, 370, 372, 396, 439, 516, 555],
the graph coloring problem [57, 242, 366, 372], mathematical cryptology
[405, 444], the automata homomorphism problem [198], finding spanning
trees and Euler tours in a graph [393], solving the traveling salesman problem
[286, 287, 330, 397], and logical arithmetic [93].

• Computer science and artificial intelligence: the constraint satisfaction problem
[13, 191, 206, 358, 448], the n-queens problem [191, 241, 482], extended
inference [22], logic programming [96, 98, 139, 327], abductive
inference for synthesizing composite hypotheses [295], semantic information
processing [22, 161, 394], puzzles and cryptoarithmetic [189, 241, 274,
364, 365, 391], truth maintenance [122, 124, 127, 138, 368], production
systems [277, 378, 379], the soma cube and instant insanity problems
[191], theorem proving [268, 314, 424, 554], and neural network computing
[13, 14, 129, 250, 351, 271].

• Machine vision: the image matching problem [22, 88, 447, 550], line and
edge labeling problems [76, 159, 512, 538, 558], stereopsis, scene analysis
and semantics-based region growing [22, 76, 158, 159, 160, 512, 538],
the shape and object matching problem [67, 115, 246], syntactic shape
analysis [116, 245, 308], the shape from shading problem [12, 50, 173, 261,
262, 264, 263, 273, 361, 409], and image restoration [193].

• Robotics: related vision problems [88, 272], the packing problem [133], and
trajectory and task planning problems [46, 152].

• Computer-aided manufacturing: task planning [390], design [388, 389],
solid modeling, configuring tasks [174], cellular manufacturing system design,
scheduling [164, 353], and 3-dimensional object recognition [229, 263].

• Database systems: operations on objects [515, 518], database consistency
maintenance, query answering and redundancy checking, query optimization
[78, 515], concurrency control [31, 154, 357], distributed database
systems [185], truth and belief maintenance [122, 124, 127, 138, 368],
the relational homomorphism problem [241, 515], and knowledge organization
for recognition systems [243].

• Text processing: optical character recognition [90, 384, 499], character constraint
graph models [269], printed text recognition [21, 269], handwritten
text recognition [480], and automatic correction of errors in text [517].

• Computer graphics: construction of 2-dimensional pictures and 3-dimensional
graphical objects from constraints, and reasoning about the geometrical features of
3-dimensional objects [55, 180].

• Integrated circuit design automation: circuit modeling [75, 506], logic minimization
[253], state assignment [526, 527], state minimization [204, 438],
asynchronous circuit synthesis [223, 435, 434, 436], I/O encoding for sequential
machines [455], power dissipation estimation [135], logic partitioning
[85, 143, 325, 395, 453], circuit layout and placement [11, 36, 47, 97,
112, 134, 233, 491], scheduling and high-level synthesis [48, 323, 406],
pin assignment [45, 452], floorplanning [415, 500], interconnection analysis
[141, 142], routing [1, 71, 131, 132, 232, 333, 404, 441, 445, 467],
compaction [68, 140, 244, 304, 324, 344, 460, 477, 524], performance
optimization [298, 315, 363, 450, 500], testing and test generation [136,
279, 326, 146, 416], and verification [313, 502, 513]. Please also see: Jun
Gu, Satisfiability Problems in VLSI Engineering, 1996.

•  Computer  architecture  design:  instruction  set  optimization  [4,  114,  205, 
282,  433,  437],  computer  controller  optimization  [27,  33,  305,  355,  432, 
438],  arithmetic  logic  circuit  design  [77],  compiler  system  optimization  [7, 
343],  scheduling  [37,  137,  186,  187,  230,  336],  fault-tolerant  computing 
[24,  260,  20],  task  partitioning  and  assignment  [39,  40,  111,  275,  478], 
load  balancing  [347,  392,  557],  real  time  systems  [254,  281,  316,  494, 
495,  504],  data  flow  consistency  analysis  [7],  data  module  assignment  in 
memory  system  [7],  and  parallel  and  distributed  processing  [440,  465]. 

•  High-speed  networking:  contact  the  authors. 

•  Communications:  contact  the  authors. 

•  Security:  contact  the  authors. 

In other areas, such as industrial (chemical, transportation, construction, nuclear)
engineering, management, medical research, and the social sciences, there are numerous
SAT/CSP applications.
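
To make "formulated as an instance of SAT" concrete for one entry in the list
above, the following minimal sketch encodes graph coloring as CNF clauses. It is
an illustration only: the function name `coloring_to_cnf`, the 0-based vertex and
color numbering, and the signed-integer clause format are assumptions of this
example, not notation taken from the cited works.

```python
def coloring_to_cnf(num_vertices, edges, num_colors):
    """Encode graph k-coloring as CNF clauses (lists of signed ints).

    Variable var(v, c) is true iff vertex v gets color c (both 0-based).
    This is one straightforward encoding chosen for illustration.
    """
    def var(v, c):
        return v * num_colors + c + 1

    clauses = []
    for v in range(num_vertices):
        # Every vertex receives at least one color.
        clauses.append([var(v, c) for c in range(num_colors)])
        # No vertex receives two different colors.
        for c1 in range(num_colors):
            for c2 in range(c1 + 1, num_colors):
                clauses.append([-var(v, c1), -var(v, c2)])
    for u, w in edges:
        # Adjacent vertices never share a color.
        for c in range(num_colors):
            clauses.append([-var(u, c), -var(w, c)])
    return clauses


triangle = [(0, 1), (1, 2), (0, 2)]
cnf = coloring_to_cnf(3, triangle, 3)
```

Any satisfying assignment of the resulting clauses corresponds to a proper
coloring, so a SAT algorithm applied to `cnf` decides whether the triangle is
3-colorable.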


15.  Future  Work 

A number of future research directions for the satisfiability problem have been
discussed recently. They were further emphasized at the 1996 DIMACS Satisfiability
Workshop.

General Boolean Expressions and Evaluation. Many practical application
problems are expressed as Boolean satisfiability problems by a compact set of
general Boolean functions. Although the transformation of a general Boolean
expression into CNF can be done in polynomial time, it will result in a substantially
larger clause-form representation [192, 412]. While this may not be critical in
complexity theory, it will have serious impact on the time to solve these problems.
To this end, efficient representation and manipulation of general Boolean functions
is crucial to solving practical application problems.
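
The growth in representation size can be seen on a small case. The sketch below
uses a Tseitin-style translation, which is one standard polynomial-time way to
obtain an equisatisfiable CNF; the text above does not say which translation it
has in mind, so this choice is an assumption. The tuple-based expression format
and the name `tseitin` are likewise made up for this example.

```python
from itertools import count


def tseitin(expr, num_inputs):
    """Translate a nested Boolean expression into an equisatisfiable CNF.

    expr is an int (input variable 1..num_inputs), ("not", e),
    ("and", e1, e2), or ("or", e1, e2). Clauses are lists of signed ints.
    Returns (clauses, total number of variables including auxiliaries).
    """
    fresh = count(num_inputs + 1)   # ids for auxiliary gate variables
    clauses = []

    def encode(e):
        if isinstance(e, int):
            return e
        if e[0] == "not":
            return -encode(e[1])
        a, b = encode(e[1]), encode(e[2])
        g = next(fresh)             # g stands for the output of this gate
        if e[0] == "and":           # clauses for g <-> (a and b)
            clauses.extend([[-g, a], [-g, b], [g, -a, -b]])
        elif e[0] == "or":          # clauses for g <-> (a or b)
            clauses.extend([[g, -a], [g, -b], [-g, a, b]])
        else:
            raise ValueError(f"unknown operator {e[0]!r}")
        return g

    root = encode(expr)
    clauses.append([root])          # require the whole expression to be true
    return clauses, next(fresh) - 1


# (x1 and x2) or (not x3)
clauses, num_vars = tseitin(("or", ("and", 1, 2), ("not", 3)), 3)
```

For this three-variable expression with two binary connectives, the sketch
introduces two auxiliary variables and emits seven clauses, which hints at how
clause form grows relative to the compact original.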

Theoretical Issues. Recent research on SAT has brought up some interesting
theoretical problems, such as average time complexity analysis [25, 212, 228,
268, 362, 420], determining the satisfiable-unsatisfiable boundary [109, 307, 380],
global convergence and local convergence rate [216, 227], and the structure and
hardness of input models [102, 170, 196]. Some of the problems, e.g., average
time complexity analysis, are extremely difficult [335]. So far only some preliminary
efforts based on simplified assumptions have been reported [49, 220, 216, 227].

One of the recent efforts to solve SAT formulas is to find subclasses for which
the problem is solvable in polynomial time [153, 184]. Future work in this direction
aims at building hierarchies of formula classes, analyzing the properties of such
hierarchies, and evaluating the hierarchies qualitatively.
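
As a hedged illustration of a polynomially solvable subclass, the sketch below
decides Horn formulas (at most one positive literal per clause) by forward
chaining. Horn formulas are chosen here only because they are a well-known
example; the works cited above may concern other classes. The name `horn_sat`
and the signed-integer clause format are assumptions of this example.

```python
def horn_sat(clauses):
    """Decide satisfiability of a Horn formula by forward chaining.

    Each clause is a list of signed ints with at most one positive literal.
    Returns the set of variables forced to True (all others can be False),
    or None if the formula is unsatisfiable.
    """
    for clause in clauses:
        if sum(1 for lit in clause if lit > 0) > 1:
            raise ValueError("not a Horn clause: " + repr(clause))

    true_vars = set()
    changed = True
    while changed:                  # each pass adds a variable or stops
        changed = False
        for clause in clauses:
            body = [-lit for lit in clause if lit < 0]
            heads = [lit for lit in clause if lit > 0]
            if all(v in true_vars for v in body):
                if not heads:
                    return None     # all negated variables are forced True
                if heads[0] not in true_vars:
                    true_vars.add(heads[0])
                    changed = True
    return true_vars


model = horn_sat([[1], [-1, 2], [-2, -3]])
conflict = horn_sat([[1], [-1, 2], [-1, -2]])
```

Because every pass over the clauses either forces a new variable to True or
terminates, the number of passes is bounded by the number of variables plus one,
so this simple version runs in polynomial time.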

SAT Algorithm Development. The development of new algorithms and
improved techniques for satisfiability testing has been a long-term effort of the
research community and the industry. From a computational efficiency point of view,
specific data structures and implementation details of SAT algorithms are crucial.
The algorithm space shows a number of asymmetrical and irregular places, implying
further opportunity for new SAT algorithm development.

From an experimental point of view, it is difficult to find a super algorithm
that performs well for a wide range of SAT instances. Existing SAT algorithms
complement rather than exclude each other by being effective for particular problem
instances. One of the future directions is to continue the development of the
MultiSAT algorithm, integrating different algorithms using a cluster of computers [219]
(Section 11.8). Computer hardware and memory space are becoming increasingly
inexpensive. Trading hardware for improved performance is therefore a
promising approach (in fact, trading memory space for speed was a basic design
philosophy behind the RISC computer architectures).
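
The idea of letting complementary algorithms work on the same instance can be
sketched as a small portfolio. This is not the MultiSAT algorithm of [219]; it
is a toy illustration under stated assumptions: two deliberately simple solvers
(`enumerate_solver`, `local_search_solver`), threads standing in for a cluster
of computers, and clauses given as non-empty lists of signed integers. All names
are hypothetical.

```python
import random
from concurrent.futures import ThreadPoolExecutor, as_completed
from itertools import product


def clause_ok(clause, model):
    """True iff at least one literal of the clause holds under model."""
    return any(model[abs(lit)] == (lit > 0) for lit in clause)


def satisfies(clauses, model):
    return all(clause_ok(c, model) for c in clauses)


def enumerate_solver(clauses, n):
    """Complete but exponential: try every assignment of n variables."""
    for bits in product([False, True], repeat=n):
        model = dict(zip(range(1, n + 1), bits))
        if satisfies(clauses, model):
            return "SAT", model
    return "UNSAT", None


def local_search_solver(clauses, n, max_flips=10_000, seed=0):
    """Incomplete random walk: may find a model, never proves UNSAT."""
    rng = random.Random(seed)
    model = {v: rng.random() < 0.5 for v in range(1, n + 1)}
    for _ in range(max_flips):
        unsat = [c for c in clauses if not clause_ok(c, model)]
        if not unsat:
            return "SAT", model
        var = abs(rng.choice(rng.choice(unsat)))   # flip a variable of a false clause
        model[var] = not model[var]
    return "UNKNOWN", None


def portfolio(clauses, n, solvers=(enumerate_solver, local_search_solver)):
    """Run all solvers concurrently; return the first conclusive answer."""
    with ThreadPoolExecutor(max_workers=len(solvers)) as pool:
        futures = [pool.submit(s, clauses, n) for s in solvers]
        for fut in as_completed(futures):
            status, model = fut.result()
            if status != "UNKNOWN":
                # Leaving the with-block still waits for the slower solvers;
                # a real portfolio would cancel them instead.
                return status, model
    return "UNKNOWN", None


formula = [[1, 2], [-1, 3], [-2, -3]]
status, model = portfolio(formula, 3)
```

The design point is only that an incomplete but often fast method and a complete
but slow method can share one instance, with whichever reaches a conclusive
answer first deciding the result.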

For important practical applications, there may be significant problem domain
information. Efficient SAT algorithms may be developed by exploiting input- and
application-specific structures (Section 11.5). Specialized algorithms tailored to
particular applications, in turn, provide key insights into general
satisfiability testing.

Practical Application Case Study. It has been recognized by SAT researchers
that practical application problems are the driving forces for SAT research;
they are the ultimate benchmarks to test SAT algorithms. This direction
was further addressed by the NSF, the advisory committee, and the organizing
committee of the 1996 DIMACS Satisfiability Workshop [147, 288, 289]. There
has been a strong relationship between theory, algorithms, and applications of SAT.
A major step in the future is to bring together theorists, algorithmists, and
practitioners working on SAT and on industrial applications involving SAT, enhancing
the interaction between the three research groups. It would be beneficial to
the research community and to industry if we can apply theoretical and algorithmic
results on SAT to practical problems, while taking these practical problems for
further theoretical/algorithmic study. In addition to theoretical/algorithmic study,
in the future, we will also further concentrate on significant industrial case studies
of SAT, practical applications of SAT algorithms, and practical and industrial SAT
benchmarks.

Parallel Algorithms and Architectures. Implementing an algorithm on
VLSI hardware architectures is a common practice to speed up algorithm execution.
Not only does it offer faster execution, but certain sequential portions of the
algorithm may also be implemented in parallel form in hardware. SAT
per se has granularity at the search tree level, clause level, and variable
level that lends itself well to parallel processing. A number of parallel algorithms and
architectures for solving SAT have been developed and have been found to perform
well at different levels of granularity. Two basic approaches have been taken in
this direction: implementing parallel SAT inference algorithms on special-purpose
VLSI chips [224, 225], and implementing tightly-coupled, parallel SAT algorithms
on existing sequential computer machines [207, 212, 212, 490, 489].
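
Granularity at the search tree level can be illustrated with a minimal sketch:
fix a few variables in every possible way and explore the resulting subtrees
independently. This is not one of the cited parallel algorithms; the names
`simplify`, `dpll`, `solve_branch`, and `parallel_sat` are hypothetical, the
splitting procedure is a bare variable-splitting search, and threads are used
only to show the decomposition (in CPython they would not give a real speedup
for this CPU-bound work).

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product


def simplify(clauses, lit):
    """Set lit to True: drop satisfied clauses, delete the falsified literal."""
    return [[l for l in c if l != -lit] for c in clauses if lit not in c]


def dpll(clauses):
    """Plain recursive variable splitting; True iff clauses are satisfiable."""
    if not clauses:
        return True
    if any(len(c) == 0 for c in clauses):
        return False
    lit = clauses[0][0]
    return dpll(simplify(clauses, lit)) or dpll(simplify(clauses, -lit))


def solve_branch(clauses, cube):
    """Search the subtree in which every literal of cube is fixed to True."""
    for lit in cube:
        clauses = simplify(clauses, lit)
    return dpll(clauses)


def parallel_sat(clauses, split_vars, workers=4):
    """Split the search tree on split_vars and explore subtrees concurrently."""
    cubes = [[v if bit else -v for v, bit in zip(split_vars, bits)]
             for bits in product([True, False], repeat=len(split_vars))]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(solve_branch, [clauses] * len(cubes), cubes)
        return any(results)         # satisfiable iff some subtree is


sat_result = parallel_sat([[1, 2], [-1, 3], [-2, -3]], split_vars=[1, 2])
unsat_result = parallel_sat([[1, 2], [1, -2], [-1, 2], [-1, -2]], split_vars=[1])
```

Splitting on k variables yields 2^k independent subproblems, so the number of
split variables controls how much work is available to distribute.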

Algorithm Engineering Approach. Aho, Johnson, Karp, Kosaraju, McGeoch,
Papadimitriou, and Pevzner have recently proposed an algorithm engineering
approach for the experimental testing of algorithms [8]. They believe that
“Within theoretical computer science algorithms are usually studied within highly
simplified models of computation and evaluated by metrics such as their asymptotic
worst-case running time or their competitive ratio. These metrics can be
indicative of how algorithms are likely to perform in practice, but they are not
sufficiently accurate to predict actual performance. The situation can be improved
by using models that take into account more details of system architecture and
factors such as data movement and interprocessor communication, but even then
considerable experimentation and fine-tuning is typically required to get the most
out of a theoretical idea. Efforts must be made to ensure that promising
algorithms discovered by the theory community are implemented, tested
and refined to the point where they can be usefully applied in practice.”


16.  Conclusions 

The SAT problem is at the core of the class of NP-complete problems and has
many practical applications. In recent years, many optimization methods, parallel
algorithms, and practical techniques have been developed for solving the SAT
problem. The past two decades have seen the proliferation of many SAT algorithms:
resolution, local search, global optimization, BDD SAT solvers, and multispace search,
among others. Existing methods complement rather than exclude each other by
being effective for particular instances of SAT. In this survey, we presented a general
algorithm space that integrates existing SAT algorithms into a unified perspective.
We described several major classes of SAT algorithms, with an emphasis on
recent advances. We gave a performance evaluation of
some existing SAT algorithms. This survey also provides a set of practical
applications of SAT. The area of SAT research is a rich land of well-developed theory and
methods. Applying theoretical and algorithmic results to practical problems seems to be the
ultimate way to test and benchmark SAT algorithms. Not only will the end results
of such an endeavor have a major scientific and industrial impact, but in the process it
will push optimization technology to its limit.